Chemical Informatics Letters

Editor: Jonathan M Goodman

Volumes 1-7

Volume 7

FDA: library of chemical information (July 2003)
The US Food and Drug Administration now provides a library of chemical information.

Isotope Effects Toolkit (July 2003)
This web site from the Technical University of Lodz provides a suite of programs for calculations of kinetic or equilibrium isotope effects using results of major quantum mechanical packages.

INASP (July 2003)
The International Network for the Availability of Scientific Publications (INASP) works to improve the flow of scientific information, especially for countries with less developed systems of publication and dissemination. It was established in 1992 by the International Council for Science. It supports the Programme for the Enhancement of Research Information (PERI), which distributes over 7500 full-text electronic journals and databases. In 2002 INASP also conducted a survey of many publishers to determine what programs, if any, they had available for disseminating scientific information to the developing world.

CSBi (July 2003)
The MIT Computational and Systems Biology Initiative (CSBi) links biology, computer science and engineering in an approach to the analysis of complex biological problems. It is co-chaired by Bruce Tidor and Peter Sorger.

Glossary for Toxicokinetics of Chemicals (July 2003)
This IUPAC glossary (Provisional Recommendation) defines common terms in the multidisciplinary field of toxicokinetics. It was compiled primarily for chemists working in toxicology who require a knowledge of the expressions used in toxicokinetics.

Wikipedia (July 2003)
Wikipedia is a project to create a complete and accurate open content encyclopedia. It was started on January 15, 2001 and there are currently 133412 articles being developed. Other on-line encyclopedias are available which focus on science, including Hyperphysics from Georgia State University, which investigates how physics subjects are linked, and includes connections to chemistry, and ScienceWorld from Wolfram Research - Eric Weisstein's encyclopedia - which includes a section on chemistry.

GIF liberation day (July 2003)
The US patent covering the GIF image file format expired on June 20th, 2003. The Free Software Foundation has commented on this. However, GIF liberation is delayed outside the USA, as the counterpart patents in the United Kingdom, France, Germany and Italy expire on June 18, 2004, and the Japanese counterpart patents expire on June 20, 2004.

JenPep (July 2003)
JenPep is a database of quantitative binding data for immunological protein-peptide interactions, run from the Jenner Institute for Vaccine Research.

JISC-biomedcentral agreement (July 2003)
JISC (the UK Joint Information Systems Committee) has reached an agreement with BioMed Central: up to 80,000 medical and clinical researchers at 180 universities will now be able to publish their work at no charge in any of BioMed Central's range of online biomedical journals.

MEPNET (July 2003)
MePNet (Membrane Protein Network) is a research network dedicated to G-Protein Coupled Receptors (GPCR) structural genomic technology, run by Bio-Xtal, a proteomic services company.

DBCAT (July 2003)
A catalogue of biological databases, run by INFOBIOGEN, an organisation founded by the French government and the university Evry Val d'Essonne. The project is a collaboration with the EBI which maintains a list of software for molecular biology and genetics: Biocatalog.

viewmol3d (July 2003)
An MS Windows OpenGL program, created by Andrew Ryzhkov and Arcady Antipin, intended for the visualization of molecular models from quantum chemistry calculations.

ChemBank (August 2003)
ChemBank was mentioned in (Chem. Inf. Lett. 2002, 5, #3, 1). It now has a website, explaining that it is a freely available collection of data about small molecules and resources for studying their properties, especially relating chemistry to biology, and a suite of informatics tools and databases aimed at promoting the development and use of chemical genetics by scientists worldwide. It was described in Science, 2003, 300, 294-5. Chembank currently has a database of about 2000 compounds, with SMILES strings, chemical structures and biological activity. Soon other files will be downloadable.

Clustering Search Results (August 2003)
If a web-search produces a large number of hits, it is useful to be able to cluster them into groups. A new service that does this is Vivisimo. It produces a textual analysis, which might be compared to Kartoo's visualisations. RefViz is a text analysis and visualization package designed to analyze, organize, and facilitate the comprehension of data, which is being used by Thomson, the publishers of the ISI Web of Knowledge to analyse literature.

RSC retrodigitisation (August 2003)
The Royal Society of Chemistry is digitising all its journals from 1841 to 1996. These will be available for outright purchase or for annual lease. The ACS is also digitising its journals, although its earliest publications are more recent (1879) and are only available by annual lease. Elsevier has a similar programme going back to 1939, with journals available for a one-off fee.

Digital Libraries Initiative, phase 2 (August 2003)
The Digital Libraries Initiative, phase 2 has produced a report on Revolutionizing Science and Engineering through Cyberinfrastructure. The National Digital Library is one of the NSF's projects in this area (Chem. Inf. Lett. 2003, 6, #1, 12). A part of this is iLumina, a digital library of sharable undergraduate teaching materials for chemistry, biology, physics, mathematics, and computer science.

Haystack (August 2003)
Haystack is a tool developed at MIT to manage information in the way that makes the most sense to each user - a platform for authoring end-user semantic web applications. A preliminary release of the software is available, free of charge, but there is no user manual yet.

MERLOT (August 2003)
MERLOT (Multimedia Educational Resource for Learning and Online Teaching) is a free and open resource with links to online learning materials, for many subjects, including over two hundred for chemistry. It is also a variety of grape.

Macromolecular Structure Database Server (August 2003)
The EBI's Macromolecular Structure Database (MSD) has announced new services, including MSDchem, a consistent and enriched library of ligands, small molecules and monomers that are referred to as residues and hetgroups in a PDB entry.

(20) (August 2003)
This web site appears to be an index of molecular modelling and drug design resources and 'softwares'.

NetEquation (August 2003)
NetEquation is a part of ThinkQuest's educational library, and claims to be 'a complete on-line chemistry resource'. It is a well set out introduction to chemistry.

linux4chemistry (August 2003)
This site, part of the World Wide Web virtual library, provides an index to chemistry programs available for the Linux operating system. It continues to be updated, and more than twenty links to programs have been added this year.

Clide (August 2003)
CLiDE is a chemical literature data extraction tool from SymBioSys Inc, a company which started from Professor Peter Johnson's group at Leeds University. It recognises scans of chemical structures and changes them into structures with chemical sense. The program has been available for some time.

ChemIndex (August 2003)
ChemIndex is the professional's version of CambridgeSoft's ChemFinder. It searches the same database, but provides more facilities for a subscription fee.

Element 110 is named darmstadtium on August 16th 2003 (September 2003)
Element 110, discovered at GSI in Darmstadt, is officially called darmstadtium.

American Mineralogist Crystal Structure Database (September 2003)
A crystal structure database that includes structures published in the American Mineralogist, The Canadian Mineralogist, and the European Journal of Mineralogy. The database is maintained under the care of the Mineralogical Society of America and the Mineralogical Association of Canada, and financed by the National Science Foundation.

Lyx (September 2003)
TeX (LaTeX) made easier and open source. LyX is an open source document processor that encourages an approach to writing based on the structure of your documents, not their appearance. LyX lets you concentrate on writing, leaving details of visual layout to the software. It creates documents in a rigidly structured way, which suits some journals.

Cartage (September 2003)
Cartage (Central Array of Relayed Transaction for the Advance of General Education) allows schools and individuals to share data and resources for the education community in Lebanon and the Middle East. One of the themes is Chemistry.

Directory of Chemical Engineers (September 2003)
This directory of chemical engineering faculty is run by Steve Swinnea from the University of Texas at Austin. This is an easier undertaking than a list of chemists (c2k), as chemical engineering is a smaller subject than chemistry, but it is an impressive database.

Knowitall (September 2003)
KnowItAll, a database of spectral data, has a free academic edition, which draws structures, analyses IR and Raman spectra, and accesses a multi-technique spectral database with cross-references. It is available for Windows, but not for Linux or Macintosh computers.

World Standards Day (September 2003)
World Standards Day will be celebrated on October 14th by ISO, and in the US on September 30th with a competition and a banquet.

Prizes for Chemical Informatics (September 2003)
Professor W. Graham Richards (Oxford University) will be awarded the 2004 ACS Award for Computers in Chemical and Pharmaceutical Research (sponsored by Accelrys).

Professor A. Peter Johnson (Leeds University) will be awarded the 2004 Herman Skolnik Award of the ACS Division of Chemical Information, recognizing outstanding contributions to and achievements in the theory and practice of chemical information science.

Organic-Chemistry.Org (September 2003)
This site has a good domain name, and a list of chemistry-related links. It is not clear who runs it, nor what its remit is, although it has links to and advertisements from resources such as Scirus, and it will welcome sponsorship.

Digital Library for Earth (September 2003)
The Digital Library for Earth System Education (DLESE) is a geoscience community resource, funded by the National Science Foundation, which is being built by educators, students, and scientists.

Biology Browser (September 2003)
The Biology Browser ("free information from a trusted source") is provided by Biosis, a non-profit organization that has delivered flexible information services since the 1920s.

Patent Primer for Chemists (and Non-Chemists) from the ACS (September 2003)
This primer, "What Every Chemist Should Know About Patents", is now available from the ACS at a new URL.

The UK Biobank (October 2003)
The UK Biobank project will be the world's biggest resource for the study of the role of nature and nurture in health and disease. The project will follow the health of a large group of volunteers for many years, collecting information on environmental and lifestyle factors and linking these to medical records and biological samples. Anonymised data will be used for research. It is run as a charitable company funded by the Medical Research Council and the Wellcome Trust. The BBC reports that detailed rules on ethics and governance have now been published.

Dewey Decimal System (October 2003)
The Dewey Decimal system is now more than a hundred years old, and is widely used by libraries across the world. Despite its long use, the system is not in the public domain: it is owned by the OCLC (Online Computer Library Center), a non-profit organisation, from which licenses for its use are available.

Journal of Chemical Information and Computer Sciences (October 2003)
George A W Milne is retiring as editor of The Journal of Chemical Information and Computer Sciences (JCICS). William L Jorgensen of Yale University is taking over, and intends to split the journal into two (C&E News article: subscribers only). The new journals, which may be launched in 2005, may be called The Journal of Chemical Information and Modeling and The Journal of Chemical Theory and Computation.

MDL data interchange format (October 2003)
MDL publishes its file formats and they are widely used. It has now developed a new data interchange format, the XDfile, which is XML-based. This format specification is available on the same web page.

eCheminformatics 2003: 10th-14th November (October 2003)
eCheminformatics 2003 is an international conference, to be held on the Internet, which brings together researchers to discuss the applications of Cheminformatics methods to Drug Discovery. It is organised by Barry Hardy, who has organised numerous international virtual conferences in the area of the chemical, life and medical sciences. He currently chairs the EU-supported Eurotron Virtual Conferences in Glycobiology, and manages the Eyeforpharma Knowledge Management Hub and the Cheminformatics Hub and Virtual Conferences.

$10 million NIH grant for proteomics centre (October 2003)
The Pacific Northwest National Laboratory has won a five-year grant from the National Institutes of Health to support a centre for basic research in proteomics. It is the largest NIH grant in the lab's 38-year history.

Biology on ArXiv e-print server (October 2003)
The ArXiv server was a physics server, but now has a (Quantitative) Biology section.

PLoS: Public Library of Science (October 2003)
The first journal in the Public Library of Science, PLoS Biology, launched its first issue on October 13, 2003. Unlike most other journals, reading the papers is free, but authors pay $1500 to get a paper published. Will this lead to a revolution in open access to knowledge (Berlin Declaration)? Soon after its release, the journal attracted so much interest that its webserver could not cope with the demand.

Instructional Resources for Chemistry (October 2003)
This site provides annotated Web links to instructional materials and other resources of interest to Chemistry teachers and course designers, maintained by Steve Lower, a retired member of staff from Simon Fraser University.

Comb-e-chem (October 2003)
Comb-e-Chem is one of the UK's national e-science projects, run by Jeremy Frey. It is working on Grid-enabled combinatorial chemistry, concentrating on crystallography and laser and surface chemistry, and is also interested in making tea, without a traditional laboratory notebook.

Structural Database of Allergenic Proteins (October 2003)
This database is run from the University of Texas Medical Branch. It consists of a Web server that integrates a database of allergenic proteins with various bioinformatics tools for performing structural studies related to allergens and characterization of their epitopes.

General and organic chemistry pages (October 2003)
These pages, designed by Professor Gerard Dupuis at the lycee Faidherbe de Lille, describe a number of chemical reactions and phenomena, including the oscillating Belousov-Zhabotinski reaction.

NMRShiftDB (November 2003)
An open-source, open-content database for chemistry structures and their NMR data, developed at the Max Planck Institute for Chemical Ecology, by Christoph Steinbeck, who is now at the University of Cologne. Submission of data is open, but peer reviewed and the database contains over 5000 spectra - and fewer compounds - submitted by fewer than 20 people. NMRShiftDB allows for searching for (sub-)spectra, (sub-)structures and other properties. The database has been described in a recent publication: J. Chem. Inf. Comp. Sci. 2003, 43, 1733-1739.

Geneinfo (November 2003)
A Genetics and Biotech Information Network, run from the Department of Genetic Engineering at the Jordan University of Science and Technology.

The British Library (November 2003)
The British Library, the university libraries of Cambridge, Oxford and Trinity College Dublin, and the national libraries of Scotland and Wales are legal deposit libraries, so publishers are obliged to send a copy of their publications to these libraries on request. Until now this has not covered electronic information, but a new Act will extend it to some non-print information, including CD-ROMs and on-line resources. The bill has passed the committee stage and now goes on to the report stage.

Institute for Molecular Manufacturing (IMM) (November 2003)
This non-profit organisation, founded in 1991, carries out research in molecular nanotechnology.

Web links in scholarly publications (November 2003)
Increasingly, scientific publications include URLs amongst their references, but these do not have the same longevity as published papers. Indeed, they may no longer work by the time the paper is published. A recent paper in Science (Robert P. Dellavalle, et al. Science 2003, 302, #5646, 787-788) showed that 13% of links were broken 27 months after publication. How will scholars of the future be able to interpret papers that depend on links to inaccessible information?
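A broken-link percentage of this kind can be tallied with a few lines of code. The following is a minimal sketch in Python, not the method used in the Science paper: the function name `broken_fraction`, the `is_alive` checker and the example URLs are all hypothetical, and the checker is a stand-in so that the example runs without network access.

```python
from typing import Callable, Iterable


def broken_fraction(urls: Iterable[str], is_alive: Callable[[str], bool]) -> float:
    """Return the fraction of URLs for which is_alive(url) is False."""
    url_list = list(urls)
    if not url_list:
        raise ValueError("no URLs supplied")
    broken = sum(1 for url in url_list if not is_alive(url))
    return broken / len(url_list)


# Hypothetical reference list; in practice is_alive would issue an HTTP request.
cited_urls = [
    "http://example.org/a",
    "http://example.org/b",
    "http://example.org/c",
    "http://example.org/d",
]
known_dead = {"http://example.org/b"}  # assumed for illustration only

fraction = broken_fraction(cited_urls, lambda url: url not in known_dead)
print(f"{fraction:.0%} of cited links are broken")
```

Passing the checker in as a function keeps the counting separate from the network code, so the same tally can be repeated at intervals after publication.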

EMBOSS (November 2003)
EMBOSS is the European Molecular Biology Open Software Suite and contains many useful bioinformatics programs, written in C and available for most Unix and related platforms.

Cell Illustrator (November 2003)
Gene Networks and FQS Poland have developed a program called Cell Illustrator - a tool for constructing pathway models and simulating pathway mechanisms of action of both baseline and abnormal conditions.

Sigma Aldrich online NMR and IR spectra (November 2003)
The on-line Sigma Aldrich catalogue includes free access to PDF versions of some spectra for their compounds. Free spectral data is also available from the National Institute of Advanced Industrial Science and Technology (AIST) in Japan, which has a database of spectra for organic compounds, from Thermo-Galactic, and from the NIST Chemistry Web book.

Warr Zone (November 2003)
Dr Wendy Warr's web site has a report on Chemical Information at the 225th ACS national meeting, and a new list of e-Commerce resources for Combinatorial Chemistry and High Throughput Screening.

Wiki (November 2003)
Wiki, the Hawaiian word for quick, is a way of running a web site which can be interactively updated by many people. A Chemistry Encyclopedia is being developed with this technology. Programs to set up a Wiki are available at SourceForge. An underlying MySQL database is presented as web pages, and changes are logged, so it is possible to go back to earlier versions of each page in case something inappropriate is added, or useful information lost.
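The change log described above can be illustrated with a small sketch in Python. This is an assumption-laden toy, not the code of any real Wiki package: it keeps revisions in memory rather than in MySQL, and the class name `PageHistory` and its methods are hypothetical.

```python
class PageHistory:
    """Keeps every saved version of one wiki page (in memory, for illustration)."""

    def __init__(self, text: str = "") -> None:
        self._revisions = [text]  # revision 0 is the initial text

    @property
    def current(self) -> str:
        """The most recently saved text."""
        return self._revisions[-1]

    def edit(self, new_text: str) -> int:
        """Record a new version and return its revision number."""
        self._revisions.append(new_text)
        return len(self._revisions) - 1

    def revert_to(self, revision: int) -> int:
        """Restore an earlier version by logging it as a new revision."""
        if not 0 <= revision < len(self._revisions):
            raise IndexError(f"no such revision: {revision}")
        return self.edit(self._revisions[revision])


page = PageHistory("Benzene has the formula C6H6.")
page.edit("Something inappropriate.")
page.revert_to(0)  # the unwanted edit stays in the log, but is no longer shown
print(page.current)
```

Reverting appends a copy of the old text instead of deleting later revisions, so nothing in the history is lost.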

Problems in scholarly communication (November 2003)
A view from Cornell University Library - journal costs are increasing far faster than inflation. This could be approached by relying less on commercial publishers, and developing new methods for the exchange of scholarly information.

Solubility from NIST (November 2003)
The International Union of Pure and Applied Chemistry (IUPAC)-National Institute of Standards and Technology (NIST) Solubility Data Series is now available online and contains over 30 000 solubility measurements.

Molecular Modelling Programs (December 2003)
Last reviewed in Chemical Informatics Letters 2002, 5, #6. This list excludes the major commercial molecular modelling packages and concentrates on programs for which the source code is available in some form and which are available freely or cheaply. Usually there is a license agreement restricting what may be done with the source code.

Originally developed by Peter Kollman, and now maintained by Professor David Case's group at the Scripps Research Institute and collaborators. It costs $400 for an academic license, which includes source code.

A molecular mechanics and dynamics program written in C by Professor Robert Harrison at Georgia State's Computer Science Department. It is not clear when the program was last updated, but the web pages appear to have remained unchanged for several years.

B, formerly Biomer; Free; Source Code; Java. Has moved from its old location to Professor David Case's group at the Scripps Research Institute. The page was last updated on 11th October 2002.

A plane wave/pseudopotential implementation of Density Functional Theory. The CPMD group is coordinated by Professor Michele Parrinello (Director of the Swiss Center of Scientific Computations and Professor at the ETH Zuerich) and Dr Wanda Andreoni (Manager of the Computational Material Science Group at IBM Zurich Research Laboratory). An e-mail discussion list is available to discuss the program. Last updated July 2003.

A quantum chemistry program using SCF, MP2, MCSCF or CC wave functions. The strengths of the program are mainly in the areas of magnetic and (frequency-dependent) electric properties, and for studies of molecular potential energy surfaces. Last update in June 2003. The main authors are T. Helgaker, H. J. A. Jensen, P. Jorgensen, J. Olsen, K. Ruud, H. Ågren.

Free software project for atom scale simulation, which will incorporate Molecular Dynamics and Force Fields, Quantum Chemistry and Density Functional Methods. Last updated October 2003.

Ab initio calculations. Martyn Guest, of the Daresbury Laboratory, is the main author. Last updated April 2003.

Ab initio calculations. The program is maintained by Dr Mark Gordon's research group at the Ames Laboratory. Last updated July 2003.

A computational chemistry software package released under the GNU GPL; C++; Linux. Developed in Finland by Tommi Hassinen and collaborators. Last updated December 2002.

Ichmech incorporated in 2001. Free to academics; Last updated June 2001.

Open Source Project; mainly written in Python, with a small amount of C; Konrad Hinsen, from CNRS Orleans, is also involved with FSatom (vide supra). Updated in June 2002.

A quantum chemistry package developed by Professor Peter Knowles at Birmingham University and Professor Hans-Joachim Werner at Stuttgart University. Last updated February 2003.

A computational chemistry package that is designed both for workstations and high-performance parallel supercomputers, developed in the William R Wiley Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory, USA. Last updated June 2003.

A free, full source code (Fortran) molecular mechanics and dynamics program, written in the Ponder Lab. Last updated in September 2003.

Viewing and Drawing Molecules (December 2003)
Unlike the previous section, these programs appear only to be for drawing and manipulating molecules and not for doing molecular modelling calculations, although it is sometimes unclear exactly what their capabilities are, and some of them describe themselves as 'molecular modelling programs'. There is an index of molecular display programs at the Lawrence Livermore National Laboratory, and another World Index of modelling and visualisation at the San Diego Supercomputer Centre.

Free program, particularly designed for biomolecules. Last updated July 2002.

An 'extensible molecular modelling system' from the Computer Graphics Laboratory at UCSF, and free of charge for academic use, which appears to be limited to visualisation rather than modelling. Last updated in November 2003.

A 3D visualization program for structural biology data written under X and OpenGL. Last updated in September 2003.

A molecular modelling program that helps chemists to visualize molecules and aids the design of lead drug compounds. Last updated April 2003.

A Finnish tool for the visualization and analysis of molecular structures, written in Tcl/Tk. Last updated in September 2003.

MOLecule analysis and MOLecule display, for biological molecules, written in the group of Professor Kurt Wüthrich. Last updated in January 2003.

A protein crystallographic package. Last updated in December 2002.

A molecular graphics program. Date of last update is unclear.

An open-source 'molecular modelling system' written in Python. The program appears to be designed to display and manipulate molecules. Last updated November 2003.

Visual Molecular Dynamics, from the Theoretical and Computational Biophysics Group at the University of Illinois at Urbana-Champaign. Last updated November 2003.

HSDB (December 2003)
The Hazardous Substance Databank from the NIH is a factual data file with information on the toxicology and handling procedures for over 4500 molecules. PubMed now has LinkOut access to the HSDB, so it is easy to go from an article to toxicology information.

Managing programmers (December 2003)
What is the best way to administer a programming project? Software Reality has an article that argues administrators have become too powerful.

MolSoft (December 2003)
MolSoft from San Diego is, in its own words: "a primary source of new breakthrough technologies in computational chemistry and biology". It has computational environments for molecular modeling, bioinformatics, cheminformatics and ligand docking.

Converting printed chemical structures to connection tables (December 2003)
A program that could reliably take an illustration of a molecule and convert it to a structure which could be analysed by molecular modelling and chemical informatics tools would be extremely useful. However, nothing which does this is in general use, which demonstrates that the problem is an extremely challenging one. Probably the best program available is CLiDE.

Changes for 3D molecules in web sites (December 2003)
The way in which web browsers interpret the tags <object>, <embed> and <applet> is likely to change. More information is available from Microsoft and Apple. This is likely to have a big effect on the display of chemical structures, which often depends on these tags. Instead of just displaying a molecule, the browser will produce a pop-up confirmation window. It is possible to circumvent this problem using Javascript, and the Apple website explains how to do this.

Semantic Web (December 2003)
The Scientific American has an article by Tim Berners-Lee describing the Semantic Web - a vision of how today's World-Wide-Web could develop into something more powerful: "Now, miraculously, we have the Web. For the documents in our lives, everything is simple and smooth. But for data, we are still pre-Web". Based on the Resource Description Framework, software agents might automatically explore the information available, gathering whatever is useful.

Free Unix (December 2003)
This year is the twentieth anniversary of the start of the GNU project, initiated by Richard Stallman.

Molinspiration (December 2003)
There is a new release of Molinspiration's drugability calculator. This interactive applet allows easy calculation of activity scores for potential GPCR ligands, ion channel modulators and kinase inhibitors.

Authors from five countries banned from ACS journals (December 2003)
The ACS has imposed a moratorium on papers written by authors in Cuba, Iran, Iraq, Libya or the Sudan (C&E News - subscribers only), because publishing them could be deemed to violate US trade sanctions against these countries. Robert Bovenschulte, who runs the ACS's Publications Division, said the decision was taken with great reluctance and hopes that this will be temporary.

ChemWeb (December 2003)
Elsevier has decided not to continue with its web portals: BioMedNet, ChemWeb and Elsevier Engineering. It is to be hoped that the Chemistry Preprint Server will continue. ChemWeb has provided valuable information in an innovative way over the six years of its existence, and been an important force in the development of new ways of communicating chemistry. It will be missed.

Volume 6

Chemistry 2000 (c2k) (June 2003)
Chemistry 2000 now has a list of countries in order of the number of chemistry departments with WWW servers. USA is top by a considerable amount. Many countries have a broadly consistent definition of chemistry. France is high up the list, with 111 entries, as chemistry departments are relatively rare, so research groups and centres are listed separately. The number of chemists in France that have been found for the list of people is only 88, compared with 429 for Italy and 464 for Canada, which have 45 and 52 departments listed, respectively.
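The contrast between these countries is easier to see as chemists found per listed entry. The short Python sketch below only repeats the arithmetic on the figures quoted above; the variable and function names are illustrative and are not part of c2k.

```python
# Figures quoted above: (entries listed, chemists found) for each country.
c2k_counts = {
    "France": (111, 88),
    "Italy": (45, 429),
    "Canada": (52, 464),
}


def chemists_per_entry(entries: int, chemists: int) -> float:
    """Average number of chemists found per listed department or group."""
    return chemists / entries


ratios = {
    country: round(chemists_per_entry(entries, chemists), 1)
    for country, (entries, chemists) in c2k_counts.items()
}
print(ratios)  # {'France': 0.8, 'Italy': 9.5, 'Canada': 8.9}
```

On these numbers France has fewer than one chemist per entry, against roughly nine for Italy and Canada, which is consistent with French research groups and centres being listed separately.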

Over the last few months, China and India have both moved up the list, and it may be anticipated that they will continue to rise. Some countries use 'edu' to signify educational establishments whilst others use 'ac' for academic organisations. A few countries use both.

There are other lists of chemistry departments, notably ChemDex, run by Mark Winter at Sheffield University, and the WWW virtual library hosted at Liverpool University as Links for Chemists and run by Michael Barker. All the main indexes of this type are based in the UK, and I am not aware of current competitors in other countries. Both of these have a broader scope than c2k, including many companies as well as chemistry departments. Currently, they include fewer chemistry departments than c2k. Other resources, such as PsiGate, have fewer links, but each one has been evaluated and reviewed.

Google is so effective that the need for lists of chemistry departments and chemists may have diminished. However, there are questions for which Google cannot compete. For example, in order to find out about a chemist or department with a common name, or to be sure that a representative sample of departments have been covered in a particular area, or to compare the importance of chemistry in universities in different countries, a Google answer is hard to interpret, and a reasonably complete and up to date list is more useful.

Chemistry Rules Interface (June 2003)
MDL has announced the Chemistry Rules Interface (formerly Cheshire), a chemical structure manipulation service with major extensions to chemical representation. MDL Chemistry Rules Interface is currently in use at over 125 sites supporting chemical structure representation and business rules implementation.

Indian Academy of Science Journals (June 2003)
The Indian Academy of Science publishes a series of science journals, including several which are chemistry-related. Articles may be downloaded as PDF files.

ICSTI (June 2003)
The International Council for Scientific and Technical Information (ICSTI) has the mission of promoting recognition of the value of scientific and technical information to the world's economic, research, scholarly and social progress, and enhancing access to and delivery of information.

Chem Bio Informatics Society (June 2003)
This Japanese association was founded in 1981, and has sponsored many meetings, financially supported by members of the CBI Company association, which includes many leading Japanese companies in the fields of pharmaceuticals, chemicals and computers as well as informatics. Chem-bio-informatics has also had a division in the Japanese National Institute of Health Sciences, although it has been reorganised into the Division of Safety Information on Drug, Food and Chemicals.

The Agora (June 2003)
An electronic infrastructure for the scientific community to query, deposit, review, and curate information on biochemistry.

ChemExper (June 2003)
Search for chemicals by name or structure. The ChemExper database currently contains over 100,000 structures. The service may be compared to ChemFinder from CambridgeSoft, which provides similar search facilities to a larger database, but only offers a limited service without a subscription.

Metalloproteins (June 2003)
A database run by the Metalloprotein structure and design program at the Scripps Research Institute.

Chemical Thesaurus (June 2003)
This chemical thesaurus is not available directly on-line, but may be freely downloaded. The NIH also has a thesaurus, focussed on alcohol and other drugs, which is available on-line.

TOXLine (June 2003)
TOXLINE is the National Library of Medicine's collection of bibliographic information on biochemical, pharmacological, physiological, and toxicological effects of drugs and other chemicals. A component of TOXLINE is available on TOXNET (Chem. Inf. Letters, 2002, 4, #6, June 2002), which has a searching facility.

MOLEKEL (June 2003)
A molecular graphics package for visualizing molecular and electronic structure data, available free of charge. Latest version: November 2002. The program was developed at the University of Geneva, the ETH and the Swiss Centre for Scientific Computing by Peter F Flukiger and Stefan Portmann. The manual is available on-line.

Einstein Archive (June 2003)
On-line access to Albert Einstein's scientific and non-scientific manuscripts. The site allows browsing and viewing 3,000 digitized images of Einstein's writings. In addition, the Archival Database allows access to about 43,000 records of Einstein-related documents.

BioSimGrid (May 2003)
BioSimGrid is a Grid database for biomolecular simulations, run by collaborators in Oxford, Southampton, London, Birmingham, Nottingham and York, led by Professor Mark Sansom at Oxford's biochemistry department. The project aims to develop a 'kite-mark' (quality standard) for biomolecular simulations and suitable metadata, in addition to a distributed database.

Chemical Resource Kit (CRK) (May 2003)
The CRK intends to offer a graphical interface for archiving, data entry and presentation of chemical information, as well as useful tools for data modelling. It will also offer a modern ab initio computational process for calculating chemical structures and properties, and a browser-based access pathway. It has been developed by Alex Clark, until recently a post doc with Professor Chris Reed at UC Riverside, and now at Intellichem.

Grid developments (May 2003)
GRID computing: What a developer needs to know to get started is a short overview from IBM developerWorks. How does Grid compare to P2P? Both are ways of sharing computer power. A paper submitted to the 2nd International Workshop on Peer-to-Peer Systems provides an answer (PDF file). CERN has announced the CERN openlab for DataGrid applications, a collaboration between CERN and industrial partners to develop data-intensive Grid technologies to be used by the worldwide community of scientists working at the next-generation Large Hadron Collider. It will handle petabytes of information every year.

AMBER parameter database (May 2003)
This database keeps track of new parameters developed for the AMBER forcefield. It is maintained by Richard Bryce at Manchester University's Department of Pharmacy.

Enhanced stereochemistry from MDL (May 2003)
The description of the stereochemistry of a molecule is a difficult problem, and it is hard to convey the difference between relative and absolute stereochemistry of different centres within a single structure. MDL has developed a method to deal with this problem (Full paper).

Protein Informatics Group (May 2003)
This group at the Oak Ridge National Laboratory in Tennessee develops computational tools for solving problems from molecular biology.

SETAC (May 2003)
The Society of Environmental Toxicology and Chemistry promotes multidisciplinary approaches to solving environmental problems. It was founded in 1979 to balance the interests of academia, business, and government. It has over 5000 members in over seventy countries, and publishes several journals.

p450 database (May 2003)
The goal of this database is to facilitate access to www resources for researchers working in the field of P450 proteins and P450-containing systems. It is run from the International Centre For Genetic Engineering And Biotechnology in Italy.

DOD Computational Chemistry and Materials Science (CCM) (May 2003)
The USA Department of Defense High Performance Computing Modernization Program includes chemical projects. Current projects include: Toxicological assessment of FDA-approved treatments for acute organophosphate exposure; Mechanism of action for qinghaosu-based antimalarial drugs; Cyanide prophylaxis.

MathSciNet (May 2003)
The American Mathematical Society's mathematical reviews on the web, including condensed matter physics, informatics, and computer modelling.

Cheminformatics Institute
This institute offers training in Cheminformatics. The website is not clear about the academic credentials of the institute.

Professor John C Huffman (May 2003)
Professor John Huffman is director of the Indiana University Molecular Structure Center (IUMSC) and a professor of informatics at Indiana University's School of Informatics. The IUMSC is the hub of Reciprocalnet, a distributed database used by research crystallographers from fourteen institutions to store information about molecular structures.

RAMBIOS (April 2003)
The Review Articles in Molecular Bioscience (RAMBIOS) database is available from its own web site and from the National Institute of Informatics, Japan, a computer science institute, and Kyushu University's Computing and Communications Centre. It covers more than eighty molecular bioscience journals currently from 1983 to 2002, from which articles are selected and supplied with keywords by experts.

JISC resource guides (April 2003)
The Joint Information Systems Committee (JISC) runs the Resource Discovery Network, a collaboration of many educational and research organisations. Particularly relevant to chemistry are BIOME (health and life sciences), EEVL (Engineering, Mathematics and Computing) and PsiGate (physical sciences).

SourceCode Repository (April 2003)
Where can you find small routines, each solving a small problem, which may be of use in a bigger program? Some of the following sites may be useful. See also Survey of open-source molecular modelling programs, some of which may be used to provide useful algorithms.

SciELo (April 2003)
Scientific Electronic Library Online, run as a model for cooperative electronic publishing in developing countries, particularly Latin America and the Caribbean countries. A partnership between FAPESP (State of Sao Paulo Science Foundation), BIREME (Latin America and Caribbean Center on Health Sciences Information) and CNPq (Conselho Nacional de Desenvolvimento Cientifico e Tecnologico). SciELo is a portal to the Brazilian, Chilean and the Cuban sites as well as a regional public health site.

Cactus (April 2003)
The Erlangen/Bethesda Data and Online Services are a collaboration between the Computer Chemistry Centre at Erlangen-Nuremberg and the Laboratory of Medicinal Chemistry at the Centre for Cancer Research, National Cancer Institute, National Institutes of Health. Cactus is a distributed client/server system for the computation, management, analysis and visualisation of chemical information. The intention of these services is to provide to the public structures, data, tools, programs and other useful information. This includes a way to search the open NCI database of compounds - more than a quarter of a million compounds.

PASS (April 2003)
PASS (Prediction of Activity Spectra for Substances) predicts the biological activity spectrum for a compound from its structural formula, based on a training set of about forty-six thousand biologically active compounds.

Is XML any good? (April 2003)
A recent article by Tim Bray, who worked on the creation of XML, asking 'Is XML too hard?' has catalysed debate about the merits of XML. A number of websites are run by people who do not like XML, including: Site A, Site B and Site C. The criticisms that it is verbose and offers users too much flexibility (for example, by allowing data to be stored both as elements and as attributes) are fairly easily refuted. A more serious criticism is that XML itself is fine, but XSL, XPATH, DTD, DOM, SAX, XML Schema, etc. are complicated and unfriendly, and so the attractive simplicity of XML disappears as soon as it is actually used. After much discussion, the final conclusion is probably: XML is OK.
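The flexibility criticism can be illustrated with a short sketch. The two records below describe a hypothetical molecule entry (the tag and attribute names are invented for illustration and belong to no real schema): one stores the data as attributes, the other as child elements. A small helper, using Python's standard xml.etree.ElementTree module, has to check both places to read a field.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: the same data in two encodings (names are illustrative).
AS_ATTRIBUTES = '<molecule name="ethanol" formula="C2H6O"/>'
AS_ELEMENTS = (
    "<molecule>"
    "<name>ethanol</name>"
    "<formula>C2H6O</formula>"
    "</molecule>"
)


def read_field(xml_text, field):
    """Return a field stored either as an attribute or as a child element."""
    root = ET.fromstring(xml_text)
    if field in root.attrib:
        return root.attrib[field]
    child = root.find(field)
    return child.text if child is not None else None
```

Both encodings are well-formed XML, so software that exchanges such records has to agree on one form or handle both, as read_field does here.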

Androgen database (April 2003)
The Androgen Receptor Gene Mutations Database is a database of mutations of the Androgen Receptor Gene, maintained by Bruce Gottlieb.

Pauling and DNA (April 2003)
The race to find the structure of DNA was won fifty years ago this month by Watson and Crick in Cambridge. Records from the archive of Linus Pauling give a documentary history of the events.

Teaching Chemical Information (April 2003)
Many universities have courses on chemical information. The web sites listed range from introductions to using the library, to Masters degrees.

ADMET (April 2003)
"ADMET in the 21st Century" is a conference run by the ACS in Florida, May 4-6, 2003. Speakers include: Scott Boyer (AstraZeneca), David D Christ (Bristol Myers Squibb), Sean Ekins (Concurrent Pharmaceuticals), Steve Hansel (Bristol Myers Squibb), Richard King (Merck), Paula Lapinskas (Vertex Pharmaceuticals), Scott Obach (Pfizer), Louis Plamondon, Donald Tweedie (Boehringer Ingelheim), Daniel Verber, Ronald E White (Schering-Plough), Clive G Wilson (University of Strathclyde).

Xapian (April 2003)
Xapian is an Open Source Probabilistic Information Retrieval library, released under the GPL.

.edu changes its meaning (March 2003)
Since the beginning of the internet, some sources of information have been consistently reliable. For example, an internet name ending .edu has always been a sign of a university in the USA. On 15th April, 2003, this policy will be modified to allow more institutions access to .edu names. More postsecondary institutions will be able to apply for accreditation from a range of agencies. This is not deregulation, but is a loosening of the restrictions on .edu names.

Substance Registry System (March 2003)
The US Environmental Protection Agency has rearranged its web sites. The Chemical Registry System (CRS) and the Biological Registry System (BioRS), maintained by the Agency's Office of Environmental Information, have been replaced by the Substance Registry System (SRS).

NCBI handbook (March 2003)
The National Center for Biotechnology Information (NCBI) creates databases, researches computational biology, develops software for analyzing genome data, and provides biomedical information - to better understand the molecular processes affecting health and disease. Its handbook is a series of (PDF format) chapters on macromolecular structures and genomes.

JEdit, Jext and J Java editors compared (March 2003)
What is the best Java editor? This article compares a number of editors written specifically for Java. Are Notepad, Word, or SimpleText as good? They have the benefit of simplicity, but lack Java-specific features.

Hidden Markov Models (March 2003)
Markov Models determine the probability of an event based on the series of events leading up to it. The result of tossing a fair coin is an order zero Markov Model, as the result of each throw is not affected by the previous result. A basketball match may be simply modelled by an order one Markov Model, as the chance of each team scoring depends strongly on which team scored last. This type of model can be used to analyse DNA sequences, by assuming that the probability of the next base in a sequence depends on the earlier bases in the sequence. Since the probabilities are not known, this is called a Hidden Markov Model. More information is available: Review from the Washington University School of Medicine; Tutorial and links from UC Santa Cruz.
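The difference between order zero and order one models can be sketched by counting bases in a sequence. This is only an illustration of the fixed-order models described above, not an implementation of a Hidden Markov Model; the toy sequence and the function names are invented for the example.

```python
from collections import Counter, defaultdict


def order_zero(sequence):
    """Estimate P(base), ignoring what came before."""
    counts = Counter(sequence)
    total = len(sequence)
    return {base: n / total for base, n in counts.items()}


def order_one(sequence):
    """Estimate P(next base | previous base) from adjacent pairs."""
    pair_counts = defaultdict(Counter)
    for prev, nxt in zip(sequence, sequence[1:]):
        pair_counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for prev, following in pair_counts.items()
    }


SEQUENCE = "ACGTACGTAA"  # toy data, not a real DNA sequence
p0 = order_zero(SEQUENCE)
p1 = order_one(SEQUENCE)
```

In this toy sequence the order zero estimate gives C a probability of 0.2 everywhere, while the order one estimate says that C follows A two times out of three, which is the kind of context dependence the order one model captures.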

VICIM (March 2003)
The Virtual Institute for Chemometrics and Industrial Metrology (VICIM) comprises thirteen centres across Europe led by Professor D L Massart. The Scientific Board is headed by Professor Bernard Vandeginste of Unilever. The participating centres are:

CAZY (March 2003)
The CAZy database describes the families of structurally-related catalytic and carbohydrate-binding modules (or functional domains) of enzymes that degrade, modify, or create glycosidic bonds. It is run by a CNRS group at the Universities of Provence and the Mediterranean, Structure and Function of Biological Macromolecules group.

J-Stage: Japan Science and Technology Information Aggregator, Electronic (March 2003)
J-Stage is run by JST, the Japan Science and Technology Corporation. JST was formed by the 1996 merger of the Research Development Corporation of Japan (JRDC) and JICST (the Japan Information Center for Science and Technology). It is one of the three organisations which run STN (Scientific & Technical Information Network): JST (Japan), CAS (North America) and FIZ Karlsruhe (Europe).

ChemKey (March 2003)
Professor Albert Padwa's (Emory University) synthetic method database: $325 for a single CPU. Updates are free and will be made available every year over the internet.

International Symposium on Open Access and the Public Domain in Digital Data and Information for Science (March 2003)
Jointly organised by International Council for Science (ICSU), United Nations Educational, Scientific and Cultural Organization (UNESCO), The U.S. National Academies, Committee on Data for Science and Technology (CODATA) and International Council for Scientific and Technical Information (ICSTI)

IUPAC theoretical organic chemistry acronym glossary (March 2003)
This glossary is an appendix to the 1999 IUPAC report: Glossary of terms used in theoretical organic chemistry (Professor Vladimir I Minkin). IUPAC has also produced an abbreviated list of quantities, units and symbols in physical chemistry.

New Journals (March 2003)
Bentham Science Publishers have announced the launch of fifteen new journals in the fields of pharmaceutical science and molecular medicine. The new journals include Current Proteomics, Current Drug Discovery Technologies, and Current Drug Delivery, all planned for 2004, and several journals on aspects of medicinal chemistry and drug targets, which are already available or planned for this year. The e-mail which reported this helpfully explained "This e-mail is not spamming, as we have taken your name because of your prominent position and your openly available references in the literature."

Electronic journals and retracted articles (February 2003)
Elsevier has removed some articles from the on-line versions of their journals. Some people are outraged (Elsevier's vanishing act), but removal seems a reasonable response. Unlike traditional printed journals, electronic journals allow retracted articles to be removed. Retractions and corrections may easily be missed if only print versions are accessible. Electronic archives can easily replace retracted papers with notices explaining why they were retracted, although it might be useful to keep access to the paper, if it is clearly marked as having been retracted.

Chemical Informatics (February 2003)
What is the difference between Chemical Informatics, Chemoinformatics, Chemiinformatics, or Cheminformatics?

Definitions may be built on 'Chemometrics' which is usually defined as 'statistics on chemical data' (D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte, and L. Kaufmann, Chemometrics: a textbook Elsevier, Amsterdam, 1988).

Chemoinformatics is often defined by citing an article by Frank K Brown ('Chemoinformatics: what is it and how does it impact drug discovery.' Ann. Rep. Med. Chem. 1998, 33, 375-384), in which it is defined as 'the mixing of those information resources [information technology and information management] to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the arena of drug lead identification and optimization.' There is a strong link with drug discovery, restricting the subject to this area. Brown says that chemoinformatics includes chemometrics. There seems to be no clear distinction between chemoinformatics, chemiinformatics and cheminformatics. Use of the terms in the chemical literature appears to begin in the late 1990s, and often cites this paper.

'Chemical Informatics', however, is the oldest term, appearing in the literature in the early 1980s. It includes those chemistry-related areas of information handling which are not drug-related, and so chemoinformatics is a subset of chemical informatics. A useful definition is:

Chemical Informatics:
'Computer-assisted storage, retrieval and analysis of chemical information, from data to chemical knowledge'.

This definition is consistent with Gary Wiggins' essay (Indiana University) which addresses the question: What is Chemical Informatics? The 'analysis' of chemical information may include molecular modelling and other computational methods of using the chemical information contained in Schrödinger's equation to explain and predict other chemical data.

Tetrahedron Info (February 2003)
A new website providing easy access to all resources and information relating to the family of Tetrahedron journals

Paragon (February 2003)
The ACS has just released a new web-based system providing a personalized web page for authors and reviewers.

Institute of Physics full text search (February 2003)
The Institute of Physics has made full text searching available in their Electronic Journals. The facility is free to use and searches the entire archive back to 1874. Access to full text of articles requires a subscription, except for papers published in the last month.

Research Instrumentation and Informatics Department (February 2003)
The University of Miskolc Institute of Applied Chemistry (ME AKKI) has a Research Instrumentation and Informatics Department. This investigates problems in process control and industrial information systems development.

Physico-informatics (February 2003)
In the department of Applied Physics and Physico-Informatics at Keio University, Physico-Informatics emphasizes the importance of the advanced mathematical analysis of information governed by the laws of physics. Professor Masanori Matoba's research interests include 'Spectroscopy and quantum physico-informatics of strongly correlated materials'

Department of Informatics, Torun University (February 2003)
The department of informatics (computer science) at Torun, Poland, has a number of chemistry-related projects, including quantum chemistry and molecular dynamics.

JMol (February 2003)
JMol reads many molecular data formats, and animates and displays three-dimensional molecules. It is an open-source program, like JChemPaint which is also at sourceforge. JChemPaint draws two-dimensional structures. Its development is now being transferred to its successor, JCPCDK, and its underlying library, the Chemistry Development Kit.

Chemistry Development Kit (CDK) (February 2003)
CDK classes are Java utility classes for ChemoInformatics and Computational chemistry, developed from JChemPaint, a Java Editor for 2D chemical structures, and JMDraw. The project involves Chris Steinbeck (University of Cologne's Bioinformatics Centre), Egon Willighagen (University of Nijmegen) and Dan Gezelter (Notre Dame, Indiana). "The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics" Christoph Steinbeck, Yongquan Han, Stefan Kuhn, Oliver Horlacher, Edgar Luttmann, and Egon Willighagen J. Chem. Inf. Comp. Sci. Web Release Date: 11-Feb-2003; (Article) DOI: 10.1021/ci025584y

EMBL (February 2003)
The European Molecular Biology Laboratory is based on five sites in four countries, with over seventy research groups.

JoeLib (February 2003)
JOELib's web page describes it as a platform independent open source computational chemistry package written in Java. It is able to build and edit structures in a variety of formats, calculate some atom property descriptors and create pictures. Its computational chemistry capabilities seem very limited, so far, and its main strengths are in chemical informatics.

GRID (January 2003)
The GRID - a scheme to distribute computational power over the world, not just data - continues to attract attention. Java-based GRID software is now being developed, complementing the Globus open standard toolkit.

Visualisation (January 2003)
The US Pacific Northwest National Laboratory is developing methods to visualise datasets from diverse disciplines.

c2k (January 2003)
Chemistry 2000 (c2k) continues to provide an up to date index to the world's university chemistry departments and chemical journals. In January 2003, it indexed 2005 chemistry departments and related resources, and 802 journals. Of these, forty-three are marked as having failed to register as relevant and accessible sites in the last automated check, so the database should be more than 98% accurate.
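The accuracy figure can be checked with a few lines of arithmetic. This sketch assumes that the forty-three failures are counted against the combined total of departments and journals; the entry does not say this explicitly.

```python
departments = 2005  # chemistry departments and related resources indexed
journals = 802      # journals indexed
failed = 43         # entries that failed the last automated check

# Assumption: failures are measured against all indexed entries together.
total = departments + journals
accuracy = 100 * (total - failed) / total
print(f"{total} entries, {accuracy:.1f}% passed the check")
```

Under that assumption the pass rate sits between 98% and 99%, consistent with the claim of more than 98%.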

XML and MicroSoft Office (January 2003)
Microsoft has announced that it will move towards an XML file format for office applications in forthcoming releases, to enable better data exchange. A complicated XML document is not necessarily much easier to interpret and modify than a binary file, and it is reasonable to ask how open the new format will be. This seems a useful step up from the widely used RTF format.

European Academy of Sciences (January 2003)
This academy is a "non-profit non-governmental, independent organization of the most distinguished scholars and engineers performing forefront research and the development of advanced technologies". Academy membership is considered a high honour, according to the web site. But what is this academy? There seemed to be no members from England at the end of 2002, and the academy appears to have been founded very recently, although the date of foundation is not mentioned on the web site. What level of distinction does membership of the academy imply? At the moment, the answer is not clear.

Free Software Project for Atomic-scale Simulations (January 2003)
A free software project for atomic scale simulation has been set up following a CECAM workshop.

BCI (January 2003)
Barnard Chemical Information (BCI), a Yorkshire-based company, provides specialised chemical informatics software and services to clients worldwide.

National Bioinformatics Institute (January 2003)
The National Bioinformatics Institute (NBI) has an impressive-sounding name, and, according to its website, supplies 'two of the world's most respected certificates in the field of bioinformatics'. However, the website gives no names and contact information for people within the organisation. There is molecular informatics information copied from various other institutions, and there is a list of NBI board members. When contacted by e-mail, with a request for more information about the institute, none of the board provided any information about the institute, except the name of the CEO, Arnold S. Dion. Dr Dion explained that the claim for world-wide respect may be viewed as an exaggeration. The NBI has no connection with the EBI (European Bioinformatics Institute), despite the superficial similarity of the names.

Public Library of Science (January 2003)
The Public Library of Science (PLoS) is a non-profit organization of scientists committed to making the world's scientific and medical literature a public resource. In December 2002, it received a nine-million-dollar grant from the Gordon and Betty Moore Foundation to launch free-access biomedical journals (Press release). The PLoS will distribute its journals for free, but will charge authors to publish in them.

Open Source Books (January 2003)
Computer programs can be open source - what about books? Prentice Hall, the academic and reference book publisher, is publishing 'open source' text-books, which are legal to copy, modify, and redistribute, unlike most other books. The web site emphasises this, but the Prentice Hall web pages for the books do not.

ChemBrain (January 2003)
ChemBrain is a worldwide unique chemical database for three-dimensional molecular structures with integrated artificial intelligence, from ExportSoft, for PCs. The web site provides good pictures, but little detailed information. A free trial of the software is available.

NSDL (January 2003)
NSDL is a digital library of exemplary resource collections and services, organized in support of science education at all levels, funded by the Directorate for Education and Human Resources of the NSF. This might be compared with PsiGate, a UK-funded collection of physical science resources, although the NSDL is not limited to the physical sciences. The first funding cycle was 2000-2002, with the initial release in December 2002.

Volume 5

Molecular Modelling Programs (December 2002)
Last reviewed in CIL 3, #6, December 2001. This year, more programs are included. This list excludes the big commercial offerings and concentrates on programs for which the source code is available in some form and which are available freely or cheaply. Usually there is a license agreement restricting what may be done with the source code.

A molecular mechanics and dynamics program written in C by Professor Robert Harrison at Georgia State's Computer Science Department.

B, formerly Biomer; Free; Source Code; Java. Has moved from its old location to Professor David Case's group at the Scripps Research Institute. The last update was in 2000.

A plane wave/pseudopotential implementation of Density Functional Theory. The CPMD group is coordinated by Professor Michele Parrinello (Director of the Swiss Center of Scientific Computations and Professor at the ETH Zuerich) and Dr Wanda Andreoni (Manager of the Computational Material Science Group at IBM Zurich Research Laboratory). An e-mail discussion list is available to discuss the program.

A quantum chemistry program using SCF, MP2, MCSCF or CC wave functions. The strengths of the program are mainly in the areas of magnetic and (frequency-dependent) electric properties, and for studies of molecular potential energy surfaces. It has been actively developed in 2002, by its authors in Denmark, Sweden and Norway. The main authors are T. Helgaker, H. J. A. Jensen, P. Jorgensen, J. Olsen, K. Ruud, H. Ågren.

Ab initio calculations. Martyn Guest, of the Daresbury Laboratory, is the main author.

Ab initio calculations. The program is maintained by Dr Mark Gordon's research group at the Ames Laboratory.

A computational chemistry software package released under the GNU GPL; C++; Linux. Developed in Finland by Tommi Hassinen and collaborators, and updated in 2002.

Ichmech incorporated in 2001. Free to academics; no changes in 2002.

Open Source Project; Mainly written in Python, with a small amount of C; Konrad Hinsen, from CNRS Orleans. Updated in 2002.

A quantum chemistry package developed by Professor Peter Knowles at Birmingham University and Professor Hans-Joachim Werner at Stuttgart University.

A computational chemistry package that is designed both for workstations and high-performance parallel supercomputers, developed in the William R Wiley Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory, USA.

A free, full source code (Fortran) molecular mechanics and dynamics program, written in the Ponder Lab. Updated in June 2002.

BioMed Central (December 2002)
Do you want access to your research articles to be restricted to those who can afford a subscription to the relevant journals? There are alternatives, including submission to a BioMed Central journal. There is a fee for having a paper published for many of these journals.

PubScience closes (December 2002)
PubScience was a free source of scientific information which was closed because it was simultaneously so important that it was a major threat to commercial publishers and so unimportant that no-one would be inconvenienced by its disappearance. The Software and Information Industry Association lobbied for the closure, following its position statement: (a) Pubscience enters into commerce; (b) PubScience provides access to a database of bibliographic information that duplicates and competes with databases made available by private sector publishers. The SIIA issued a press release commenting on the closure, commending the decision, as "PubSCIENCE substantially duplicated private sector offerings". Not everybody is happy. Is it fairer to charge researchers for the articles they use than to charge taxpayers for the cost of running a Web site that makes them available for free? The latter may well be the most cost effective way of distributing information to small institutions, who will feel the loss the most.

PSIdev: Proteomics Standards Initiative (December 2002)
The Proteomics Standards Initiative aims to define community standards for data representation in proteomics to facilitate data comparison, exchange and verification. It was founded at a HUPO (Human Proteome Organisation) meeting. Its first step will be standards for protein mass spectrometry data and protein-protein interaction data.

SVG (December 2002)
Scalable Vector Graphics (SVG) 1.1 and Mobile SVG are now proposed recommendations by W3C - a step up from a candidate recommendation. SVG is an XML-based graphics format, which allows any image to be encoded as a text file, and does not depend on licensing data-compression programs.
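Because an SVG image is an XML text file, it can be written and read with ordinary XML tools. The sketch below builds a minimal image as a string and parses it with Python's standard xml.etree.ElementTree module; the drawing itself (a red circle on a 100 by 100 canvas) is invented for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"  # the SVG XML namespace

# A complete image, written as plain text: a red circle on a 100x100 canvas.
SVG_TEXT = (
    f'<svg xmlns="{SVG_NS}" width="100" height="100">'
    '<circle cx="50" cy="50" r="40" fill="red"/>'
    "</svg>"
)

# Parse the image like any other XML document and look up the circle element.
root = ET.fromstring(SVG_TEXT)
circle = root.find(f"{{{SVG_NS}}}circle")
```

Saving SVG_TEXT to a file with a .svg extension should give an image that an SVG-capable viewer can display, with no binary encoding or compression step involved.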

OASIS (December 2002)
OASIS is a not-for-profit, global consortium that drives the development, convergence and adoption of e-business standards. It has a working group on an Open XML Format for Office Applications, which might provide an alternative to the ubiquitous MicroSoft .doc format. This standard will be used by OpenOffice, a free program.

Organic Syntheses (December 2002)
Organic Syntheses is a standard reference publication for organic chemists. It is now available on-line using a ChemDraw plug-in.

Space Simulator (December 2002)
One of the world's fastest computers, using 294 processors in a Beowulf cluster, was built using cheap components. The calculations of interest must not require too much bandwidth between the nodes. It is in 85th place amongst the most powerful computers in the world (Top 500) but at less than $1000 per node, it scores well on price per Teraflop.

DSpace (December 2002)
DSpace is a joint project of MIT Libraries and the Hewlett-Packard Company, providing long-term storage for the digital products of MIT faculty and researchers. MIT's OpenCourseWare already provides material from MIT's courses.

Free Electronic Journals (December 2002)
Elektronische Zeitschriftenbibliothek (Electronic Journals Library) provides an index of journals by subject for which the full text is freely available. Free Medical Journals is a similar site, which is dedicated to the promotion of free access to medical journals over the internet. These lists of journals will change rapidly as publishers change their charging mechanisms, and new journals become available.

ChemVista (December 2002)
An Indian Chemistry Portal for ChemInformatics

Chemical Information (December 2002)
Chemical Information from the National Library of Medicine in the USA.

NIST Chemical Kinetics Database (November 2002)
A compilation of kinetics data on gas-phase reactions covering gas-phase radical chemistry - public beta release 1.1. The NDRL/NIST compilation of kinetics data on solution-phase reactions is also available.

EuroSpec (November 2002)
The International Spectroscopic Data Bank and Archive Project, starting in 2002. Nearly all NMR, infra red and mass spectra are unavailable, as there is no convenient way to archive this information accessibly. EuroSpec aims to change this. This may be compared with the Cambridge Crystallographic Database, which archives crystallographic data.

PubCrawler (November 2002)
PubCrawler is a free "alerting" service that scans daily updates to the NCBI Medline (PubMed) and GenBank databases. Registration required. Run from the genetics department of Trinity College, Dublin, by Professor Ken Wolfe. Not to be confused with PubCrawler which provides a guide to beer.

XML 1.1 (November 2002)
A new version of the XML standard has been proposed by the W3C. It has a minor incompatibility with XML 1.0 which may benefit some. The new standard should follow the Unicode specification.

Prof Dr Hans Lohninger (November 2002)
Professor Lohninger's research at the Vienna Technical University concerns the utilization of computers in chemistry.

WorldSciNet (November 2002)
'Publishing at the cutting edge of Science and Technology'. Journals relevant to chemical informatics include: Journal of Theoretical and Computational Chemistry; International Journal of Nanoscience; The American Journal of Chinese Medicine; Journal of Bioinformatics and Computational Biology.

Donall MacDonaill (November 2002)
Lecturer in Physical Chemistry at Trinity College Dublin, whose interests include encoding digital information on molecules. A recent analysis of information encoding in DNA has been published in Chem Comm 2002, 2062.

Stanford Peers (November 2002)
What is happening in peer-to-peer networking? This Stanford group is developing new techniques.

Shastra (November 2002)
Located at the Center for Computational Visualisation at the University of Texas, the Shastra project (from the Sanskrit word for Science) covers various aspects of visualisation including the molecular world.

MySQL (November 2002)
Bloomberg reports that the major database suppliers may lose some of their market to small cheap competitors: Oracle, IBM, Microsoft Challenged by Free Database Software. PostgreSQL is also mentioned.

GeneWeaver (November 2002)
The GeneWeaver project involves the development of a flexible system for automatic genome analysis and annotation. It has moved from Warwick, to Brunel, to Southampton and to HGMP. It is now at UCL, having left a trail of hyperlinks across the web.

Canada's most powerful computer (November 2002)
Uses the equivalent of over three years of CPU time every day to solve chemical problems. Eighteen universities are involved, including Alberta and Guelph. It is coordinated by chemistry professor Wolfgang Jaeger. An FAQ is available. The calculations will use MOLPRO.

MIT Open Courseware (October 2002)
MIT is making many of its courses available on line, for free. This approach is different to the one taken by many other universities. Will this make people more or less likely to pay to study at MIT? The chemistry section so far has only one course, quantum mechanics, which is presented as a series of PDF files. There are probably better chemistry resources at the moment, for example The McGill Office for Chemistry and Society.

Reference Spectra Databases (October 2002)
Reference Spectra Databases from Fiveash data management (mainly FTIR). The spectra are formatted for use with search software packages from other software and FTIR instrument companies.

Predict Protein (October 2002)
PredictProtein is a service for sequence analysis, and structure prediction, developed at Columbia University in New York, and hosted at the EMBL in Heidelberg.

DOI: digital object identifier (October 2002)
Most journal articles now have DOIs - digital object identifiers, which allow unambiguous indexing (Resolve a DOI). In order to assign DOIs, publishers have to buy in to the relevant organisation - CrossRef for science (Membership fees). Lots of questions answered at the FAQ and a partial list of scientific subscribers is available.
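
As a rough illustration of what resolving a DOI involves, the sketch below turns a DOI string into the URL a browser would follow. It assumes the public proxy at https://doi.org/, which is not named in this entry, and the DOI shown is a made-up placeholder rather than a real article.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy server; the entry above does not name it.
RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return the resolver URL for a DOI, percent-encoding unsafe characters."""
    doi = doi.strip()
    # Every DOI starts with the directory indicator "10." and has a prefix/suffix slash.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep the slash between prefix and suffix; encode anything unsafe in a URL.
    return RESOLVER + quote(doi, safe="/")


# Hypothetical DOI, for illustration only.
print(doi_to_url("10.1234/example.5678"))
```

The resolver then redirects to wherever the publisher currently hosts the article, which is what makes the identifier persistent even when the article's address changes.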

COMSEF: Computational Molecular Science and Engineering Forum (October 2002)
A forum for scientists and engineers using molecule-based theories, modelling, and simulation. Activities include Cache - molecular modelling task force for teaching chemical engineering, not to be confused with Cache; an International Comparative Study on Applying Molecular and Materials Modeling and the First Industrial Fluid Properties Simulation Challenge.

Computational Chemistry Comparison and Benchmark DataBase (October 2002)
It can be hard to show unambiguously that one method is better than another. This site, from NIST, provides reference data on 615 gas-phase molecules which can be used to make quantitative comparisons.

Elsevier archive (October 2002)
From August 12th, 1996, all scientific journals published in the Netherlands by Elsevier Science were to be archived in the Koninklijke Bibliotheek in the Hague - probably the first such agreement for electronic archiving between a publisher and a national library. The agreement was strengthened in August 2002, extending to about 1500 journals. Should Elsevier ever leave publishing, the library will be obliged to continue to make the archive available.

Physics Preprint Server (October 2002)
The founder of the widely-used physics preprint server receives a MacArthur fellowship. Cornell is pleased according to their newsletter. The ChemWeb Preprint Server provides a similar service to chemistry, but has not established the same dominant position in chemistry as the physics server has achieved in some areas of physics.

Chemical Registry System (October 2002)
The Chemical Registry System (CRS) provides information on chemical substances and how they are represented in the Environmental Protection Agency (EPA) regulations and data systems.

PhysicsWeb (October 2002)
Following reports of fraud, a physicist has just been sacked for scientific misconduct: Executive summary; Full Report. Genuine mistakes and misinterpretations are also possible, of course.

Journal publications (October 2002)
Patrick Brown is trying to raise money to support the review and publication of results for free. The Journal of Biology is a new journal making results freely available, from BioMed Central. Can it possibly gain enough prestige for it to compete with established and expensive journals? The editorial board is a strong collection of biologists. Would they offer tenure to a member of their own faculty who had published mainly in this journal?

National Oceanographic Data Center (October 2002)
Public access to global oceanographic and coastal data, products, and information.

ChemBank (September 2002)
In March, Professor Stuart Schreiber was awarded a $40 million grant to develop a "Molecular Target Laboratory". A primary goal is to develop a public database, "ChemBank," to link biological processes to the proteins responsible and place in the public domain many of the small molecules that can fuel future drug research. No information on how to access ChemBank seems to be available, but the laboratory has an informatics page on its web site under development.

The Case for Institutional Repositories (September 2002)
What is the future of academic publishing? This paper, from the Scholarly Publishing and Academic Resources Coalition (SPARC) suggests that every institution might create its own archive of information: "Institutional repositories represent the logical convergence of faculty-driven self-archiving initiatives, library dissatisfaction with the monopolistic effects of the traditional and still-pervasive journal publishing system, and availability of digital networks and publishing technologies."

ACD Elucidator Challenge (September 2002)
Is it possible to go from spectra to structure automatically? ACD believes that it is, and is looking for data to test their programs. Very high quality 2D NMR data is required, however.

Teaching chemical information (September 2002)
At the Indiana University "Clearinghouse for Chemical Information Instructional Materials" (part of CHEMINFO) there is information on how to teach Chemical Information.

WHO/TDR Malaria Database (September 2002)
An information resource for scientists working on malaria research, maintained in Melbourne by The Walter and Eliza Hall Institute of Medical Research and at Monash University, and funded by WHO's programme for Research and Training in Tropical Diseases (TDR).

Chiron (September 2002)
The Chiron computer program was developed by Professor Stephen Hanessian at the Université de Montréal, for the analysis and perception of stereochemical features in organic molecules, and to suggest how they may be synthesised.

SimBioSys (September 2002)
Simulated Biomolecular Systems is a software company founded by members of Professor Peter Johnson's group at Leeds University. One of their products, Sprout, is a de novo ligand design system.

Edsger W. Dijkstra (September 2002)
Professor Dijkstra died in August 2002; an archive of his papers is available. One of his most famous publications, from 1968, is: "Go To Statement Considered Harmful".

ASDL - The Analytical Sciences Digital Library - Online September 2002 (September 2002)
The ASDL is an electronic library that collects, catalogs and links web-based information or discovery material (URLs) pertinent to innovations in curricular development and supporting resources in the analytical sciences. It goes on-line in September 2002.

PM5 from Cache (September 2002)
A new semi-empirical Hamiltonian has been developed, PM5. This is available for all main group elements. Original announcements were made last year. More information on MOPAC 2002 and LinMOPAC 2002 is available.

GRID: Virtual laboratory project (September 2002)
"Enabling 'Molecular Modelling for Drug Design' on the World Wide Grid" is a project run from the School of Computer Science and Software Engineering at Monash University. Reports on using P2P Grids for molecular modelling and drug design are linked from this page.

Biomolecular modelling (September 2002)
The biomolecular modelling group of Cancer Research UK is run by Paul Bates. The former head of the group, Professor Michael Sternberg, moved to be head of the Structural Bioinformatics Group at Imperial College in 2001.

Chinese Medicine (August 2002)
The active molecules in traditional Chinese medicine are beginning to be analysed. A number of resources are now available:

Spectraheap (August 2002)
Software and database of spectral lines for emission analysis. A free demonstration version is available.

Element 118? (August 2002)
The discovery of element 118 has already been retracted, but now there is the suggestion of deliberate fraud. This sort of fraud is very hard to discover, because the experiments are so difficult. If other groups fail to reproduce the results, this does not necessarily mean that the original reports are mistaken or fraudulent. Bell Labs are also investigating the allegation that recent research may not be reproducible.

JOELib (August 2002)
JOELib is a platform independent open source computational chemistry package written in Java. The original C++ version is called OELib from OpenEye. Features include: import and export filter for SMILES; 'SMiles ARbitrary Target Specification' (SMARTS) substructure search; Chemical Markup Language (CML); POVRay export (including aromatic rings).

RXList (August 2002)
An internet drugs index, founded and maintained by Neil Sandow, Pharm.D., a licensed California Pharmacist. Search brandnames, keywords, etc.

Introduction to Metabolic Biochemistry (August 2002)
A supplement to the lecture with links to public databases, by Lukas K. Buehler, at the University of California, San Diego.

The Merck Manual of Diagnosis and Therapy, Seventeenth Edition (August 2002)
This manual is made available by Merck, who also publish the Merck Index, which is not freely available on-line.

Unilever Centre for Ethics (August 2002)
A centre which serves both the University of Natal and wider South African society (not to be confused with the Unilever Centre for Molecular Informatics).

SpectraGalactic (August 2002)
Spectra Online database is a collection of public domain data from various sources, providing a range of spectroscopic information.

The public's library and digital archive (August 2002)
A collaboration of the Center for the Public Domain (a non-profit foundation) and the University of North Carolina at Chapel Hill. For example, a computer science book is available for several languages, including Java.

Professor Lev Goldfarb (August 2002)
A professor of computer science at the University of New Brunswick, Canada, who is applying informatics to biological systems.

FDA: Elemental Analysis Manual (August 2002)
A repository of the analytical methods used in FDA laboratories, from the Center for Food Safety and Applied Nutrition. The FDA produces a number of science references from the Office of Regulatory Affairs.

Open Source Biology (July 2002)
Washington Monthly has an article on open-source biology. Molecular biology probably has the best examples, in any scientific discipline, of open-source computer programs playing a fundamental role. Can biological techniques and information also be 'open source'? The concept is the "antithesis of corporatized research". Can it change the way new biology is discovered and exploited?

NCI DIS (July 2002)
The National Cancer Institute 3D structure database: a collection of 3D structures (Chem-X generated) for over 400,000 drugs

(3) (July 2002)
The on-line resource serving the spectroscopy community, from Wiley. It describes itself as a "Community of Interest" website, launched on May 1, 2001, and it "will be the definitive spectroscopy resource on the internet" covering all major spectroscopic techniques. It contains news, articles and links to other WWW resources. It also covers Proteomics and Chemometrics.

Chemometrics (July 2002)
Chemometrics is the study of chemical data. Many web sites exist, including:

Handbook of Fluorescent Probes and Research Products (July 2002)
This on-line handbook contains a large amount of background and technical information, including spectra, on the product line (over 2600 compounds) of Molecular Probes.

XML namespace (July 2002)
XML namespaces are a mechanism to prevent elements and attributes within XML documents being confused when the documents are combined. The XML namespaces recommendation provides a system to do this. The United States Congress is taking notice (XML and Legislative Documents). "The purpose of this website is to provide information related to the ongoing work of the U.S. House of Representatives in relation to the eXtensible Markup Language (XML). Under the direction of the Senate Committee on Rules and Administration and the House Committee on Administration, the Secretary of the Senate and the Clerk of the House have worked together with the Library of Congress and the Government Printing Office to create Document Type Definition files (DTDs) for use in the creation of legislative documents using XML."
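
A minimal sketch of the mechanism, using Python's standard xml.etree.ElementTree module: two vocabularies both define a title element, and the namespace each one is bound to keeps them apart when they appear in one document. The namespace URIs, prefixes and element names are invented for this example and are not taken from any of the documents mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this illustration.
CHEM = "http://example.org/chemistry"
PUB = "http://example.org/publishing"

DOC = f"""<report xmlns:chem="{CHEM}" xmlns:pub="{PUB}">
  <chem:title>Benzene</chem:title>
  <pub:title>Annual Report</pub:title>
</report>"""

root = ET.fromstring(DOC)

# Map the prefixes used in our queries to the namespace URIs.
ns = {"chem": CHEM, "pub": PUB}

# Both elements are called "title", but the namespace tells them apart.
chem_title = root.find("chem:title", ns).text
pub_title = root.find("pub:title", ns).text
print(chem_title, "|", pub_title)

# Internally the parser stores the expanded name: {namespace-URI}local-name.
print(root[0].tag)
```

The prefix itself carries no meaning; only the URI it is bound to identifies the vocabulary, so two documents may use different prefixes for the same namespace.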

Cresset Biomolecular Discovery (July 2002)
This company has a new approach, termed XEDs (extended electron distributions), to better understand the electronic properties of molecules. This technology has led to improved conformational profiles of molecules and better binding energy calculations.

A simulated intestine! (July 2002)
Simulations-plus are a software company which can supply a simulated intestine, a useful tool for modelling how drugs may be absorbed, distributed, metabolised and excreted in the human gastrointestinal tract.

Internet scale operating system (July 2002)
Authors of Seti@home envisage an internet scale operating system. This would require huge network bandwidth, but 10 Gb ethernet has now been demonstrated, which may lead to new possibilities for distributed supercomputing and remote visualisation.

UDRP (July 2002)
Domain name dispute resolution - a website maintained by Professor Michael Geist of the University of Ottawa Law School, containing studies of how disputes have been resolved.

NAS and wavelets (July 2002)
Wavelets are a mathematical modelling technique which is more effective than Fourier transforms for some applications, including electronic image formats and the FBI's fingerprint database. This article describes the development of the method.
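
To give a feel for the idea, here is a small sketch of one level of the Haar transform, the simplest wavelet. It is an illustrative choice made for this example, not the method used in the article or in any particular image format, and it uses the unnormalised averaging form. Each pair of samples is replaced by its average (a coarse view of the signal) and half its difference (the local detail).

```python
def haar_step(signal):
    """One level of the unnormalised Haar transform: pairwise averages and details."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_inverse(averages, details):
    """Rebuild the original samples from averages and details."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal


samples = [4, 6, 10, 12]
averages, details = haar_step(samples)
print(averages, details)
print(haar_inverse(averages, details))
```

Where a signal is locally smooth the details are small and can be stored coarsely or discarded, which is the basis of wavelet compression.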

SPARC (July 2002)
The Scholarly Publishing and Academic Resources Coalition is campaigning for more competition in scholarly publishing to make research articles in all academic fields more accessible. A recent interview with the CEO of Elsevier suggests that Tetrahedron Letters has more content than SPARC's Organic Letters and therefore is actually cheaper to access on a per article basis.

Volume 4

ACS backfile (June 2002)
The ACS Web Editions subscription will include only the current year and four previous years. Libraries that do not also buy the Archives subscription will lose access to earlier content, and, each year, to one more year of earlier content. If a library cancels the archives subscription, all access to that content will be lost. The Royal Society of Chemistry has recently announced free backfiles: "Each institutional subscriber is entitled to continuing, perpetual access to and use of the publications to which it has subscribed, published electronically during the period of its paid subscription".

Chemical Consultants (June 2002)
A network of chemical consultants, supported by AIChE (American Institute of Chemical Engineers) and the ACS (a sub-group of the Philadelphia Section of the ACS). It has several hundred members, mainly located between New Jersey and Delaware.

Infotrieve (June 2002)
Infotrieve, in its own words, is "the definitive research portal, leading the market in article research and delivery." The company has endowed a documents delivery service at UCLA. It provides articles from many sources including PubList and MEDLINE.

BLAST faster (June 2002)
AltiVec technology on the PowerMac G4 processor means that BLAST sequence searches can run particularly quickly on Apple's processor.

Prestige Factor gone (June 2002)
A startup company that challenged the ISI establishment and its prestigious Journal Impact Factor has come and gone. Prestige Factor is no more.

WAX (June 2002)
Active library - fast and secure delivery of information through peer-to-peer networks. The company was formed in 1997 and is a 'spin-off' from Dr Iain Buchan's group in the Medical Informatics Unit of the University of Cambridge.

LogP (June 2002)
A guide to Log P and pKa measurements and their use by Mark Earll BSc(Hons) CChem MRSC.

TOXNET (June 2002)
A cluster of databases on toxicology, hazardous chemicals, and related areas, from the National Library of Medicine (USA). The NLM also provides Information on Hazardous Chemicals and Occupational Diseases.

What Every Chemist Should Know about Patents (June 2002)
The 3rd edition of the American Chemical Society Committee on Patents and Related Matters' patent primer, "What Every Chemist Should Know about Patents", is available on the website of the ACS Office of Legislative and Government Affairs. This booklet covers the basics of patent law, with a USA slant.

Open content network (June 2002)
A peer-to-peer system: "We are in the process of creating the Open Content Network, which aims to be the world's largest content delivery network (CDN). Users will soon be able to download open source and public domain software, movies, and music at incredibly fast speeds from this global, distributed network." Using a new p2p technology, the "Content-Addressable Web", users will be able to use their computers to help the network.

Information for scientists (May 2002)
How can scientific information be made available and kept available at an affordable price? How can the research community in the developing world keep up with the latest research, when main-stream journals are so expensive? There are many projects which might help to provide answers, including:
  • The Open Archives Initiative (supported by the Digital Library Federation, the Coalition for Networked Information, and the National Science Foundation)
  • the Open Society Institute in Budapest (supported by George Soros)
  • a not-for-profit electronic publishing service committed to providing access to quality research journals published in developing countries (joint initiative of University of Toronto Libraries, Canada, Reference Center on Environmental Information, Brazil and Bioline/UK)
  • Electronic Publishing Trust for Development (based in Worksop, UK, with trustees from around the world)
  • International Network for the Availability of Scientific Publications (financial support from British Medical Association, Carnegie Corporation of New York, CDSI, CTA, Danida, French Ministry of Foreign Affairs, National Academy of Sciences (USA), NORAD, Reuters, Royal Swedish Academy of Sciences, Sida, UNESCO and WHO)
  • Scientific Electronic Library Online, an electronic library for some Brazilian scientific journals

Sourceforge (May 2002)
The world's largest open source development web site. Chemical contributions include:

A program to convert between many different molecular data formats.

CML sourceforge
Chemical mark-up language - XML for molecules

JOELib is an open source computational chemistry package written in Java by Jörg Kurt Wegner, which can manipulate a number of structure formats.

A free, open source molecule viewer

VEGA (May 2002)
The Drug Design Laboratory at Milan University is developing VEGA, a bridge between many molecular software packages. It can analyze, display and manage the 3D structures of molecules.

Conceptor and Synergix (May 2002)
Molecular Conceptor is a drug discovery tool from Synergix.

Actelion Property Explorer (May 2002)
A free web page with a Java based molecule editor predicting various drug-relevant properties on the fly, from Actelion - Creative Science for Advanced Medicine

logP calculation (May 2002)
A Java Applet which calculates logP - the partition coefficient between water and n-octanol.
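
For reference, logP is the base-10 logarithm of the ratio of a compound's equilibrium concentrations in n-octanol and in water. The sketch below computes it from two measured concentrations; the function name and the numbers are invented for illustration. Note that the applet described above predicts logP from molecular structure, which is a different task from this direct calculation.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """logP = log10([solute] in n-octanol / [solute] in water), same units for both."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical measurement: the solute is 100 times more concentrated in octanol.
print(log_p(100.0, 1.0))
# Equal concentrations in both phases give logP = 0.
print(log_p(2.5, 2.5))
```

Positive values indicate a lipophilic compound that prefers the octanol phase; negative values indicate a hydrophilic one.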

Molinspiration (May 2002)
Property Calculation - logP and drug-likeness.

JME (May 2002)
Peter Ertl's Java Molecular Editor is available free of charge, if the license conditions are fulfilled. More information from Molinspiration.

Model Science Software (May 2002)
Chemistry Lab Simulations for the Classroom, the Laboratory and the Internet, developed in Canada.

Sheffield Chemoinformatics Course (April 2002)
This intensive course provides an introduction to chemoinformatics, with an emphasis on drug discovery applications. Topics covered include: 2D and 3D databases; Diversity; Computational methods; Combinatorial libraries; Analysis of screening data.

Lisp (April 2002)
Lisp is a programming language, invented by John McCarthy, who has written a history. According to McCarthy, it is an approximate local optimum in the space of programming languages, and so continues to be useful.

SyntheticPages (April 2002)
A freely available interactive database of synthetic chemistry, started by Stephen Caddick, Kevin Booker-Milburn, Peter Scott, and Max Hammond.

P3P (April 2002)
Platform for Privacy Preferences Project (P3P), not to be confused with P2P, is a simple, automated way for users to gain more control over the use of personal information, developed by the World Wide Web Consortium.

Skeptical Environmentalist (April 2002)
Just how good is the data on environmentalism? Not very good according to this book, which has stimulated much discussion. The author, Bjorn Lomborg, emphasises the importance of using data well.

Automated organic chemistry reactions (April 2002)
A rule-based software tool that claims it can find a synthesis of a product from starting materials and generate all products of an organic reaction, written by Luca Ermanni. This is a very difficult problem, which many people have tried to solve. How general and how effective is this program?

XML and libraries (April 2002)
Libraries are using XML more. For example:

Standards - how many are there? (April 2002)
World Standards Services Network (WSSN) is a network of publicly accessible World Wide Web servers of standards organizations around the world, including International Organization for Standardization - ISO

Institute of Biomedical Chemistry Russian Academy of Medical Sciences (April 2002)
The institute provides the following services:
  • CPD: Database on Cytochromes P-450
  • ONIX: Computer Program for Visualization and Analysis of 3D Structure of Proteins
  • PASS: Computer System for Prediction of Biological Activity Spectra for Substances
  • KeyLock: Computer database for molecular recognition in protein-ligand complexes

Dymond Linking (April 2002)
Dymond Linking represents some of the most pioneering technology developed in the area of online chemical information offering seamless integration between journal and database content. The Dymond Linking functionality allows you to search a whole world of chemical information with just two clicks of the mouse. Dymond Linking functionality has been incorporated into all papers published in Tetrahedron Letters since 2001 and Tetrahedron since 2002.

GRID (April 2002)
The US Department of Energy Science Grid's major objective is to provide the advanced distributed computing infrastructure based on Grid middleware and tools to enable the degree of scalability in scientific computing necessary for DOE to accomplish its missions in science.

UK QSAR and Cheminformatics Group Spring Meeting 2002 (March 2002)
June 11, 2002, Accelrys Inc, Cambridge, UK. Speakers include Scott Kahn, Accelrys; Jonathan Mason, Pfizer; Dick Cramer, Tripos; Darren Flower, Jenner Institute and Student Presentations.

Gnutella scalability (March 2002)
Is Gnutella, the peer-to-peer information-sharing technology, scalable? This thesis investigates the Gnutella network topology, and shows how complicated finding an answer to the question may be.

Cheminformatics Conference (March 2002)
Cambridge Healthtech Institute's Sixth Annual Cheminformatics Conference will take place on May 6-8, 2002, Philadelphia, USA.

Science Base (March 2002)
Science Base is aimed at everyone who has an interest in chemistry, biomedicine and related sciences, providing news, views and interviews from David Bradley.

Search Engine Communities (March 2002)
A new algorithm maps web-communities of related pages, by the analysis of links. The self-organisation of web communities is a research project at the NEC Research Institute.

Computing Fallacies (March 2002)
An article by Michi Henning lists some popular fallacies about computing, including: Fallacy 1: Computing is Easy; Fallacy 5: If It's Graphical, It's Easy; Fallacy 11: Standards are the Solution.

.NET (March 2002)
What is .NET? Does it matter? Microsoft call it "the new technology driving the next generation of the Web" and it is the Microsoft XML Web services platform.

Journals to become a thing of the past? (March 2002)
Plans to extend free access to scientific and academic research papers have received a boost with the announcement of a $3m grant from financier and philanthropist George Soros' Open Society Institute. "The literature that should be freely accessible online is that which scholars give to the world without expectation of payment." Professor Stevan Harnad will use the money to develop EPrints self-archiving software, which is intended to allow people to set up all-purpose archives on the web, which are OAI compliant.

Pauling notebooks (March 2002)
The notebooks of Linus Pauling are now available on-line.

QSAR at UNC (February 2002)
This server is run from Professor Alexander Tropsha's group at the University of North Carolina. The QSAR server provides information on a number of QSAR methods. The same group also provides the protein structure workbench.

VAM: Valid Analytical Measurement (February 2002)
"The definitive source of information for anyone interested in valid analytical measurement". This site is run by LGC (formerly the Laboratory of the Government Chemist). The site helps organisations in the UK to carry out analytical measurements competently and accurately.

CCMS (February 2002)
The Computational Center for MacroMolecular Structures (CCMS) is a joint project of UCSD, The Scripps Research Institute, and the San Diego Supercomputer Center, supported by the National Science Foundation. Programs for analyzing molecular shape and measuring shape complementarity for docked complexes are available for download, including FADE and PADRE.

MDL Alliance (February 2002)
MDL has signed a three-year marketing agreement with Partek, a leading provider of statistical and visual pattern recognition software for the scientific research community.

How unbreakable is Oracle? (February 2002)
Oracle has a marketing campaign based on the product being unbreakable. Can this be true?

ACS archive (February 2002)
The American Chemical Society has announced that it has completed the scanning of the ACS Journal Archives, back to 1879, the year of the first publication of the Journal of the American Chemical Society. The on-line links on the JACS page still stop in 1996.

Software competition (February 2002)
The European Academic Software Awards is a biennial competition for developers of academic software within higher education and research in Europe. The submission form will be closed by midnight February 26. The award ceremony will be "a formal happening" and the organisers will secure the festive status of the awards.

Open Archives Initiative (February 2002)
The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication. The initial press release gives the aims of the organisation.

HIV database (February 2002)
The HIV protease database, run by the National Cancer Institute in the USA, is an archive of experimentally determined 3D structures for HIV-1, HIV-2 and SIV.

World Technology Awards (January 2002)
The award winners have been published in Nature. Craig Venter wins the biotechnology award. Gordon Moore wins information technology. George Whitesides wins the materials award, and Richard Friend is a runner up in the commerce section.

ERCIM (January 2002)
European Research Consortium for Informatics and Mathematics: ERCIM news has an issue devoted to Grid computation and e-science

Way Back Machine (January 2002)
Search the web as it was, using this huge archive of web data. For it goes back to July 17th, 1997.

New bandwidth (January 2002)
The pan-European Gigabit Research and Education Network is in full service from December 1st 2001.

SciDev.Net (January 2002)
SciDev.Net is a free-access, Internet-based network devoted to reporting on and discussing those aspects of modern science and technology that are relevant to sustainable development and the social and economic needs of developing countries.

Seneca (January 2002)
An open-source program for Computer Assisted Structure Elucidation.

FTNMR Free Induction Decay Archive (January 2002)
An archive of NMR FIDs

Named reactions (January 2002)
A database of named reactions, from the University of Connecticut, listing about a hundred reactions.

A-values (January 2002)
A list of A-values, which are measures of the size of chemical groups, from the University of Wisconsin.

Chemistry acronyms (January 2002)
Chemistry is full of acronyms, and this database helps with some of the confusion. It is more specific for chemistry than the CIL glossary.

Volume 3

Deep web searching (December 2001)
The 'deep web', also called the 'invisible web', is the huge amount of information stored in on-line documents and databases which most general-purpose web crawlers cannot reach. It is estimated that the deep web may be 500 times bigger than the 'surface web'. The USA Department of Energy's Science and Technical Information Office is searching the deep web.

NASA release classic software (December 2001)
NASA has released a large archive of software, to celebrate its birthday.

Machine Learning Journal (December 2001)
Forty people resigned from the editorial board of the Machine Learning Journal, and joined the editorial board of the Journal of Machine Learning Research, which is free, and does not demand copyright transfer. Is this the future of academic publishing? The Scholarly Publishing and Academic Resources Coalition (SPARC) documents how low-cost publication alternatives are being used in many areas of science.

Exposure Assessment Tools and Models (December 2001)
The US Environmental Protection Agency makes tools available to assess the danger of exposure to chemicals.

Chemical Carcinogen Reference Standard Repository (December 2001)
The National Cancer Institute (NCI) operates repositories through subcontractors that supply standard compounds for cancer research.

ChemExper (December 2001)
ChemExper provides a chemical directory, to which anyone can submit data. It currently has more than 70 000 chemicals. How reliable are these data?

Beilstein Institute (December 2001)
Not to be confused with Beilstein, the database operated through Elsevier, the Beilstein Institute is a non-profit making foundation for chemistry. It has recently set up a Chair of Chemoinformatics at Frankfurt am Main.

Chemical Education Foundation (December 2001)
This foundation "offers publications and information about product stewardship" and it is "a credible source for information".

Molecular Modelling Programs (December 2001)
This list excludes the big commercial offerings and concentrates on programs for which the source code is available in some form. Usually there is a license agreement.

Free; full source code (Fortran); written in the Ponder Lab.

What If
"a versatile protein structure analysis program that can be used for mutant prediction, structure verification, molecular graphics, etc"
Cheap; A Windows version and Unix version are available.

B, formerly Biomer; Free; Source Code; Java

Open Source Project; Mainly written in Python, with a small amount of C; Konrad Hinsen, from CNRS Orleans

a computational chemistry software package released under the GNU GPL; C++; Linux

Free to academics

Ab initio calculations

ab initio calculations

E-BioSci (November 2001)
A European platform (EC funded) for access and retrieval of full text and factual information in the Life Sciences, run from EMBO. "Our aim is to ensure high-quality, peer-reviewed, complete searchable combinations of information that would be made available on the desktop of every scientist throughout the world," explains Frank Gannon, EMBO's Executive Director. It will operate in harmony with PubMed and PubMed Central.

Chemis3D (November 2001)
Chemis3D is a free Java Applet for the online 3D Visualization of Molecular Models.

ChemSpy (November 2001)
An internet navigator for the chemistry industry. It claims to be "the internet navigator..." but it averages less than 500 page views per day.

ChemNetBase (November 2001)
ChemNetBase provides a wealth of chemical information from Chapman & Hall/CRC, and offers the possibility of free trial access.

The Gold Book (November 2001)
How do you name molecules? This is the official IUPAC answer. It is not an easy question.

BOPCRIS (November 2001)
British Official Publications Collaborative Reader Information Service: Browse British Official Publications over the period 1688-1995. Useful for contemporary issues. For example: should the British Museum charge an entrance fee? A discussion is available from 1774. The debate continues.

Exploring Modern Computational Chemistry (November 2001)
A conference at Nottingham, July 31st to August 2nd, 2002, will be attended by many of the world's leading computational chemists.

Molecular scale transistors (November 2001)
How small is it possible to make a transistor? It is possible to get down to molecular dimensions, according to this report from Bell Labs.

TeraGrid (November 2001)
Computing in the data decade. The NPACI received $53 million to build a teragrid. The NPACI is the National Partnership for Advanced Computational Infrastructure.

Peer-to-Peer networking for academia (November 2001)
Will peer to peer networking have an impact on academia?

PSIgate - Physical Sciences Information Gateway (October 2001)
PSIgate is a part of the Resource Discovery Network and funded by JISC. It was launched in Manchester on 10th September. See also HERO, an internet portal for higher education and academic research in the UK.

Julia Goodfellow to head BBSRC (October 2001)
Professor Julia Goodfellow, from Birkbeck College, is to be the new head of the BBSRC, succeeding Professor Ray Baker.

Highly Cited (October 2001)
What do you have to do to be highly cited? The Institute for Scientific Information has a list of the most highly cited scientists. What does CAS think? The Chemical Abstracts Service has another approach to highly cited information: Spotlight.

Masters Degrees in Chemical Informatics
There are now three masters programs in chemical informatics. The three courses all have a number of collaborating departments, require a first degree in chemistry or a related subject, and all of them require a dissertation as a part of the assessment. The Indiana course lasts for two years, and there is a BS option as well as an MS. The UK courses are both one year.

Science Direct (October 2001)
Science Direct is "the premier electronic information service for the interdisciplinary research needs of academic, corporate and educational institutions, offering comprehensive coverage of literature across all fields of science" and is run by Elsevier. It is not very clear from the web site how much it costs to access the service.

IUPAC nomenclature (October 2001)
ACD labs have a web-based version of the IUPAC nomenclature rules. Another source of these data is the set of pages at Queen Mary and Westfield College, run by Dr Gerry Moss.

Web Start (October 2001)
New product from Sun, which makes it easier to run Java applications. You can now download and launch Java applications without going through additional installation procedures.

OpenEye (October 2001)
OpenEye is a software company whose mission is "to provide tools to address the explosion of chemical data". It is particularly interested in electrostatics and shape. Staff include Anthony Nicholls, who contributed to DelPhi and Grasp, and Roger Sayle, the author of RasMol.

Sanger Centre Software (October 2001)
An index of the software available from the Sanger Centre. This includes Artemis, a DNA sequence viewer and annotation tool, which is written in Java and can be downloaded with its source code.

Index of Chemists (September 2001)
An index of academic chemists world wide, automatically generated from the Chemistry 2000 list of departments, and regularly checked. The list currently contains almost ten thousand names.

mmCIF (September 2001)
mmCIF is a macromolecular Crystallographic Information File format, based on the earlier CIF format (S R Hall, F H Allen and I D Brown (1991) A new standard archive file for crystallography Acta Cryst, A47, 655-685).

Nature debate on e-access to literature (September 2001)
This forum, on the impact of the Web on the publishing of the results of original research, contains useful articles and opinions. The most recent opinion "The future of the electronic scientific literature" concludes that diversity is required and it would be unwise to be limited to a single economic or technological method.

ENSEMBL (September 2001)
Ensembl is a joint project between EMBL - EBI and the Sanger Centre to develop a software system which produces and maintains automatic annotation on eukaryotic genomes. Ensembl is primarily funded by the Wellcome Trust.

Google's Successor (September 2001)
Google is an extremely effective search engine. Is it the best? If so, will it stay the best? A number of other search engines are hoping to challenge it, including WiseNut, Teoma, CURE, and Vivisimo.

The importance of meta data (September 2001)
Meta data is data about data. Why should this be important? This article explains how it has been important in the development of computer operating systems, and how, with the benefit of hindsight, things could have been done better. Perhaps the experience of operating system design will be useful in the application of metadata to other areas.

W3C adopts SVG (September 2001)
SVG, scalable vector graphics from Adobe, has been adopted by the World Wide Web Consortium.

Chemical Java programs (September 2001)
Marvin is a Java based chemistry software that is available as applets, applications or Java beans. A new version has just been released. Marvin Applets and Marvin JavaBeans 2.9, JChem and JKlustor 1.5.10 have been released. The packages contain applications and development tools. The programs draw and display chemical structures, provide chemical database searching capabilities and support for diversity calculations.

Scirus (September 2001)
Scirus is a new science-only search engine from Elsevier. Because it is focussed on science sites, it ought to be much better than other search engines, such as Google, when faced with scientific data. But is it? What would constitute a fair test? When used to look for a few chemists, it worked less well than Chemistry 2000 and also less well than Google. However, this test on a few keywords was neither exhaustive nor representative.

Cybernetic Cells (August 2001)
The Scientific American (August 2001) has an article on computer models of cells and metabolic pathways, by W. Wayt Gibbs. 'Supercomputer models of living cells are far from perfect, but they are shaking the foundations of biology'. There are a number of web sites on this subject, including:

BMS acquires DuPont Pharmaceuticals (August 2001)
Bristol-Myers Squibb Company has purchased the DuPont Pharmaceuticals Company, a wholly-owned subsidiary of DuPont, for $7.8 billion.

GNU compiler for Java (August 2001)
The GNU project has released a compiler for the Java programming language. Various resources are available.

CAFASP (August 2001)
The goal of CAFASP is to evaluate the performance of fully automatic structure prediction servers available to the community. In contrast to the normal CASP procedure, CAFASP will answer the question of how well servers do without any intervention of experts, ie how well any user can predict protein structure. This may aid users in choosing which programs they wish to use, and in evaluating the reliability of the programs when applied to their specific prediction targets. Mirrors include:

ClogP (August 2001)
Oft quoted - but what is it? Pomona College has an explanation. On-line calculation is also possible from the DayLight web site. The algorithm is copyright by the BioByte Corporation and Pomona College.

Cycorp (August 2001)
A supplier of 'formalised common sense', using programs that can extract meaning from web pages.

Fugue (August 2001)
A new sequence analysis tool from the Blundell Group in Cambridge.

MICE (August 2001)
The Molecular Interactive Collaborative Environment (MICE) project is developing new methods of collaborative, interactive visualization of complex scientific data. While most existing methods of representing scientific data are static and two-dimensional, the technologies being used and developed for MICE provide interactive, three-dimensional environments within which multiple users can examine complex datasets in real-time. J Mol Graph and Mol Mod 2001, 19, 280-287.

EMP (August 2001)
Enzymes and Metabolic Pathways database, EMP, is a unique and most comprehensive electronic source of biochemical data. It covers all aspects of enzymology and metabolism and represents the whole factual content of original journal publications.

New heads for the EBI (July 2001)
Professor Janet Thornton has been appointed to head the European Bioinformatics Institute (EBI) in Hinxton, UK. She is keen to strengthen the area of "chemoinformatics" and the links between the EBI and the pharmaceuticals industry. She is looking for world class young bioinformaticists, who want to establish their own groups to take forward computational biology in novel areas, in an institute which can provide a superb environment with all the best links into new biological data.

Closure of PubScience? (July 2001)
A powerful congressional committee has passed a budget bill which, if enacted, could close down PubScience, a free search service for the physical sciences literature, operated by the US Department of Energy (DoE). Opposition to PubScience was led by SIIA.

Physical Properties Database (July 2001)
This on-line interactive demo retrieves data from our PhysProp database of 25,000 compounds.

National Toxicology Program (July 2001)
The National Toxicology Program (NTP) was established in 1978 by the Department of Health and Human Services (DHHS) to coordinate toxicological testing programs within the Department, strengthen the science base in toxicology; develop and validate improved testing methods; and provide information about potentially toxic chemicals to health regulatory and research agencies, the scientific and medical communities, and the public.

Predictive Toxicology Evaluation Challenge (July 2001)
Out of date, but fun. How well can toxicity be predicted? Are the techniques good enough to be useful?

Properties of Organic Compounds (July 2001)
Properties of Organic Compounds is a database that contains over 29,000 common organic compounds, featuring physical data, spectral data and structures. Available for free - for the moment.

Nucleic Acid Database (July 2001)
The Nucleic Acid Database Project (NDB) assembles and distributes structural information about nucleic acids. It is run from Rutgers University in New Jersey.

Failed Reactions Database (July 2001)
Accelrys (formerly MSI, and other companies) is making available a database of failed reactions. This potentially useful information is usually lost, as it is rarely published or otherwise recorded.

Innovative Journals (July 2001)
A Registry of innovative E-journals

Gene Ontology (July 2001)
Could genes be named in a consistent way? Perhaps this is not possible. The Gene Ontology Consortium is producing a dynamic controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. This should create a useful framework without requiring a rigid naming scheme.

Volume 2

Chemical Heritage Foundation (June 2001)
The Chemical Heritage Foundation (CHF) promotes the heritage and public understanding of the chemical and molecular sciences by operating the Othmer library, a historical research library, and running the Beckman Centre for the history of Chemistry.

Synthesis Planning (June 2001)
A list of synthesis planning programs, collated by Koen van Aken

Nucleic Acid Database (June 2001)
The Nucleic Acid Database Project (NDB) assembles and distributes structural information about nucleic acids.

Cheminformatics Course at UMIST (June 2001)
The chemistry department at UMIST have started a new cheminformatics MSc course. The course is full time, but it is anticipated that it will become available via a distance learning route. It is being run by Dr Andy Whiting and Dr Brian Booth.

Cheminformatics and Rensselaer Polytechnic Institute (June 2001)
The Rensselaer Polytechnic Institute runs a cheminformatics course from its IT department.

World Wide Web Consortium Issues XML Schema as a W3C Recommendation (June 2001)
Two years of development have produced a comprehensive solution for XML vocabularies: XML Schema. What is the difference between DTDs and Schemas? DTDs are rules on how the document is to be "formatted", in other words, where certain elements and tags are to be placed within a document. This refers to the document's structure, e.g. HTML, BODY. As long as those are in the correct order (as specified by the DTD), the document is "verified" correct by a validating parser. Yet this has nothing to do with the data between those tags. This is where schemas come in. They represent a validation against not only the document's structure, but also the data it contains. You could liken it to the constraint on a database table's field, e.g. CustomerType = V or I (Valid or Invalid). To continue the example from above, you could specify a schema that restricts the content of the data between the html and body tags.
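
The distinction between checking structure and checking data can be sketched in a few lines of Python. This is only an illustration, assuming a made-up customer document: the element names, the expected order and the helper functions are hypothetical, and the checks are written by hand rather than with a real DTD or XML Schema processor. Only the CustomerType = V or I rule is taken from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are invented for illustration;
# only the V/I rule comes from the CustomerType example in the text.
DOC = "<customer><name>Ada</name><CustomerType>V</CustomerType></customer>"

EXPECTED_ORDER = ["name", "CustomerType"]  # structural rule (DTD-like)
ALLOWED_TYPES = {"V", "I"}                 # data rule (schema-like)


def structure_ok(root):
    """DTD-style check: the right child elements, in the right order."""
    return [child.tag for child in root] == EXPECTED_ORDER


def data_ok(root):
    """Schema-style check: the text inside CustomerType is constrained."""
    value = root.findtext("CustomerType")
    return value is not None and value.strip() in ALLOWED_TYPES


def validate(xml_text):
    """Return (structure_ok, data_ok) for one XML document string."""
    root = ET.fromstring(xml_text)
    return structure_ok(root), data_ok(root)
```

A document with CustomerType set to X would pass the structural check but fail the data check, which is the case a DTD alone cannot catch.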

PubGene (June 2001)
New Scientist: Biologists in Norway have used a computer program to "read" the scientific literature and successfully predict gene interactions. This data-mining of the "biobibliome" provides a way of dealing with the ever-increasing torrent of biological data - millions of papers a year. But even more impressively, the completely automated process can make new genetic discoveries - essentially free research.

State Academies of Science Abstracts (June 2001)
Covers journals of USA state academies, including chemistry. Must pay to get the information.

Free net (June 2001)
Freenet is a large-scale peer-to-peer network which pools the power of member computers around the world to create a massive virtual information store open to anyone to freely publish or view information of all kinds. Now beginning to develop an SQL interface.

Semantic Web (May 2001)
A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities, according to an article in Scientific American by Tim Berners-Lee and others. If knowledge representation can be added to the internet it would create a Semantic Web, so computers would have access to structured collections of information and sets of inference rules that could be used to conduct automated reasoning. More information on the semantic web is available from the WWW consortium.

Curl (May 2001)
Curl is a new computer language that is designed for client side programs, like Java applets. Will it be a major competitor for Java? It is too early to tell, but the project is supported by Michael L. Dertouzos, Director of the MIT Laboratory for Computer Science and Timothy Berners-Lee, the creator of the World Wide Web and Director of the W3C.

JXTA (May 2001)
What is Project JXTA? Project JXTA started as a research project incubated at Sun Microsystems, Inc. under the guidance of Bill Joy and Mike Clary, to explore new styles of distributed computing. Its goal is to explore a vision of distributed computing using peer-to-peer topology, and to develop basic building blocks and services that would enable innovative applications for peer groups. JXTA is short for Juxtapose. It is a recognition that peer to peer is juxtaposed to client server or Web based computing -- what is considered today's traditional computing model.

Calculation of logP in a web browser (May 2001)
The neuro-heuristic laboratory in Switzerland has made an applet for calculating log P available.

Accelrys is the new name for the company formed from MSI, Synopsys, Oxford Molecular and the Genetics Computer Group.

MetaMath (May 2001)
The Metamath Proof Explorer has 60 MBytes of web pages containing over 3000 completely worked out proofs in logic and set theory, interconnected with more than 130000 hyperlinked cross-references. Each proof is pieced together with razor-sharp precision using simple rules, allowing almost anyone with a technical bent to follow it without difficulty.

Manchester Bioinformatics (May 2001)
Four academic staff do research on various aspects of bioinformatics, including UMBER, a specialist node of EMBnet, the European Molecular Biology Network.

Toxicology links (May 2001)
The ACS division of Chemical Toxicology has a list of toxicology links.

Collaborative Computational Projects (CCPs) (May 2001)
CCPs were established to assist universities in developing, maintaining and distributing computer programs and promoting the best computational methods. Each focuses on a specific area of research and is funded by the UK's EPSRC, PPARC and BBSRC Research Councils. The CCP projects were set up by The Central Laboratory of the Research Councils (CLRC) High Performance Computing Initiative (HPCI) Centre.
  • CCP1 - The electronic structure of molecules
  • CCP2 - Continuum states of atoms and molecules
  • CCP3 - Simulation of physical and electronic properties of surfaces and interfaces
  • CCP4 - Protein crystallography
  • CCP5 - Computer simulation of condensed phases
  • CCP6 - Heavy particle dynamics
  • CCP7 - Analysis of astronomical spectra
  • CCP9 - Computational studies of the electronic structure of solids
  • CCP11 - Biosequence and structure analysis
  • CCP12 - High Performance Computing in Engineering
  • CCP13 - Fibre and polymer diffraction
  • CCP14 - Powder and small molecule single crystal diffraction

Special Libraries Association (April 2001)
The Special Libraries Association is a society of specialist libraries, based in the US. It has a Chemistry Division.

ACS Chemcyclopedia on-line (April 2001)
A buyers guide to commercially available chemicals, run by the American Chemical Society

SciDex (April 2001)
SciDex is a software system which can organize scientific data, information and knowledge. It is an object-oriented database management system which can include, in a single system, knowledge and rules from Chemistry, Biology, Physics, etc. The first applications are: a database on Silicon NMR; Chirbase/GC and Chirbase/CE, databases on the GC and CE separation of enantiomers; Landolt-Börnstein - index of organic compounds; and CLAKS, Chemicals Kataster Online System.

Globus (April 2001)
The Globus project is developing fundamental technologies needed to build computational grids. Grids are persistent environments that enable software applications to integrate instruments, displays, computational and information resources that are managed by diverse organizations in widespread locations. The Globus project provides a toolkit, software tools that make it easier to build computational grids and grid-based applications.

Xrefer (April 2001)
This site provides free access to over fifty reference titles containing more than 500,000 entries. The sources include the Oxford University Press dictionary of science and dictionary of scientists and the New Penguin Dictionary of Science.

Public Library of Science (April 2001)
Should the record of scientific research be privately owned and controlled? This site contains an open letter and a campaign arguing that it should not. The open letter begins:
"We support the establishment of an online public library that would provide the full contents of the published record of research and scholarly discourse in medicine and the life sciences in a freely accessible, fully searchable, interlinked form. Establishment of this public library would vastly increase the accessibility and utility of the scientific literature, enhance scientific productivity, and catalyze integration of the disparate communities of knowledge and ideas in biomedical sciences."

EJI (April 2001)
A Registry of Innovative E-Journal Features, Functionalities, and Content from Iowa State University.

Enthalpy, Entropy and Heat Capacity calculation (April 2001)
HSC Chemistry is the world's favourite thermochemical software, from Outokumpu university. The calculations are based on an extensive thermochemical database which contains enthalpy (H), entropy (S) and heat capacity (C) data for more than 15000 compounds.

Perry's Data (April 2001)
McGraw Hill are making available 'the most complete database of Chemical Engineering information available anywhere!' comprising Perry's Chemical Engineers' Handbook, Lange's Handbook of Chemistry and Yaws' Chemical Properties Handbook.

Common User Agent Problems (March 2001)
This note, from the w3 consortium, gives an account of some common mistakes in user agents due to incorrect or incomplete implementation of specifications, and suggests remedies. It also suggests what should be done where the specifications are incomplete.

Standards for Topic Maps (March 2001)
Topic maps, a concept in organising information, provide a standardized notation for interchangeably representing information about the structure of information resources used to define topics, and the relationships between topics. The standard is available. There are many resources describing these, including "learn more about Topic Maps". An alternate description is also available, in terms of XML cover pages

Thermo Galactic (March 2001)
Galactic has been taken over by the informatics group in the Life Sciences division of Thermo Electron, a laboratory systems manufacturer.

MSDS databases (March 2001)
A site to search on-line safety data, which comes with a disclaimer that the information should be used at the reader's own risk.

Organic Chemistry On-Line (March 2001)
"A Living Document of Internet Resources, Information and Applications" assembled by Professor Nick Turro of Columbia University and Professor Ron Rusay of Diablo Valley College and UC Berkeley. This site is an index to a wide range of web-based material for organic chemistry. Another organic chemistry text book is available from the University of Illinois at Springfield.

Web Reactions (March 2001)
Web reactions is an organic reactions retrieval system, offering direct retrieval of reaction precedents through the internet, from J. B. Hendrickson (the author of SYNGEN) and T. L. Sander. It uses a Java applet, which does not run in all web browsers.

Opera (March 2001)
Netscape and Internet Explorer are not everything. Opera is much faster, perhaps because it is rather simpler. A preliminary release is available for download, but it does not have Java support yet.

SpecInfo (March 2001)
Web access to a database of spectra. Free access to about 5000 spectra in order to test the service.

EuroChemWeb and BioPharmaWeb (March 2001)
Two new 'on-line communities' have been started by VerticalNet Europe 'The Internet's leading network of business-to-business trading communities' according to their own press release. The company is a joint venture with British Telecommunications plc and the Internet Capital Group. EuroChemWeb is for chemical and process industry professionals, and BioPharmaWeb is for Pharmaceutical and Biotechnology industry professionals. Despite the similar name, this seems to have no connection to ChemWeb.

Clearing House for Chemical Information Instructional Materials (CCIIM) (February 2001)
The Clearinghouse for Chemical Information Instructional Materials (CCIIM) at Indiana University, collects and distributes items that were developed to instruct you in the use of chemical information sources. These may have been created by chemistry or science librarians, chemists, publishers, or others. The CCIIM was initiated by the chemistry information divisions of the American Chemical Society and the Special Libraries Association.

Mathematical Markup Language (MathML) (February 2001)
The mathematics mark up language test suite was released in December 2000. MathML is making good progress, despite mathematics requiring "the use of a complex and highly evolved system of two-dimensional symbolic notations." Does this have anything in common with chemistry? MathML is an application of XML. Is mathematics easier than chemistry? CML is also developing.

Gene Expression Markup Language (GEML) (February 2001)
Gene Expression Markup Language (GEML(tm)), an Extensible Markup Language (XML)-based tag set, was developed by Rosetta Inpharmatics and others in the GEML community to provide a standard method of exchanging gene expression data along with the associated gene and experiment annotation. This has been submitted to the Object Management Group in response to an RFP (Request for proposals) issued by the OMG in March 2000.

MGED (February 2001)
The Microarray Gene Expression Database has also submitted a proposal to the OMG. Which is better? Could there be two standards? MGED has the support of the European Bioinformatics Institute (EBI).

OASIS (February 2001)
Organization for the Advancement of Structured Information Standards (OASIS), is a non-profit, international consortium that creates interoperable industry specifications based on public standards such as XML and SGML (See OMG).

2nd Joint Sheffield Conference on Chemoinformatics: Computational Tools for Lead Discovery (February 2001)
2nd Joint Sheffield Conference on Chemoinformatics: Computational Tools for Lead Discovery

mSQL (February 2001)
A new release of mSQL is expected on February 15th. mSQL is a light weight relational database management system, similar to MySQL and PostgreSQL. It is free, subject to license conditions.

Pre-prints in chemistry (February 2001)
Physicists rely on preprint servers to communicate their work (Physics at Los Alamos and Chemical-Physics at Los Alamos and Brown University). Should chemists do this too? The ChemWeb Preprint Server has been running for some time. The ACS has just produced a formal policy statement on this issue. The ACS has decided that anything revealed on a preprint server should be regarded as a publication, and so will not be considered for publication in an ACS journal.

Science and Celera (January 2001)
Science has accepted a human genome paper from Celera, with unusual arrangements for making the raw sequence data available. Instead of insisting that the authors deposit the data in GenBank or a similar database Science has agreed to an arrangement whereby Celera makes the information available, under some conditions. Science is keeping a copy of the database to ensure that the access will continue to be possible. A statement from the editor of Science is available.

Drug information (January 2001)
A guide to more than 9,000 prescription and over-the-counter medications provided by the United States Pharmacopeia. A service of the USA National Library of Medicine.

MetaChem (January 2001)
A Web-based focal point for access to chemistry information resources of all kinds. Each resource is evaluated, classified and indexed, using the meta data recommendations of the Dublin Core. The site has over 10 000 hits a month, which is about 2-3 % of the traffic for the Cambridge Chemistry Department Web Server.

Centre for Molecular and Biomolecular Informatics (January 2001)
CMBI is based at the University of Nijmegen. Until 1999, it was known as the CAOS/CAMM Centre. The research interests of the centre include organic synthesis, genome analysis, crystal polymorph prediction, small molecule and biomolecule datamining, and macromolecular structure analysis. The centre also provides databases and software.

Life Sciences Research at the OMG (January 2001)
This group aims to improve the quality and utility of software and information systems used in life sciences research through use of the Common Object Request Broker Architecture (CORBA) and the Object Management Architecture (OMA), to encourage the development of interoperable software tools and services in life sciences research and to use the Object Management Group (OMG) technology adoption process to standardise interfaces for software tools, services, frameworks, and components.

NWChem: high performance computational chemistry software (January 2001)
A computational chemistry package that is designed to run on high-performance parallel supercomputers as well as conventional workstation clusters, developed by the High-performance Computational Chemistry group of the Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory (PNNL).

(January 2001)
German chemistry information service. The information is also available in English. This site is updated frequently, and provides an archive of software, tools, jobs, conferences, etc.

Volume 1

Pharma-Transfer (December 2000)
The Pharma-Transfer Internet database brings together previously unpublished pharmaceutical research information from all over the world. Its scope ranges from very early stage, pre-patented discovery ideas right through to proof of concept at Phase IIa, allowing the system to encompass ideas and information from academia, private laboratories and biotechnology groups, as well as from the big pharma companies themselves. The information is provided in concise abstracts, summarising drug development programmes and innovations within developing technologies from commercial and academic research groups worldwide. Guests to the site can gain limited access, in order to assess the quality and usefulness of the database. Guests are allowed unlimited access to all areas of the Pharma-Transfer website, including full use of its comprehensive search capabilities. However, guests will only have access to a very limited number of abstracts. By December 2001 it is anticipated that there will be in excess of 32,000 abstracts on the database.

Analytical informatics news (December 2000)
Bio-Rad / Sadtler's newsletter for chemists and spectroscopists.

Journal of the Chemical Computing Group (December 2000)
Chemical Computing Group Inc produce molecular modelling programs, including MOE, the Molecular Operating Environment.

The Open Science Project (December 2000)
The OpenScience project is dedicated to writing and releasing free and Open Source scientific software. Projects include Jmol, a molecular viewer and editor.

The Open Lab (December 2000)
The Open Lab is a non-profit organization committed to opening access to bioinformatics research projects, providing Open Source software for bioinformatics by hosting its development, and keeping biological information freely available. It is hosted at the Centre for Intelligent Biomaterials at the University of Massachusetts, Lowell.

JAS (December 2000)
An index to lists of journal abbreviations

MestRe-C (December 2000)
MestRe-C is a PC program for analysing NMR spectra. A similar program for the Apple Mac, Swan, is also available.

Who Owns Lecture Notes? (November 2000)
In California, a bill has been passed which answers this question - the lecturer does, in the absence of an agreement to the contrary. This introduces a ban on the commercial redistribution of lecture notes, and requires the Californian Universities to take action.

The Scientific World (November 2000)
A new 'Personal Portal to Science' launched on September 27th, 2000, which includes over 20 000 journals in its database. Articles can also be published through 'i-Publish', a new approach to the traditional idea of a journal, which pays its authors royalties! The scientific advisory board includes Professor Alan Fersht.

Lhasa (November 2000)
Lhasa is a program to help plan organic syntheses, originally developed by E J Corey at Harvard.

The Dublin Core (November 2000)
The Dublin Core is a set of core elements which can be used to structure metadata. The name comes from a workshop in Dublin, Ohio.
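
As a minimal sketch of what structuring metadata with these elements can look like, the Python fragment below builds a small XML record from three of the Dublin Core elements (title, creator, identifier). The namespace URI is the one published for version 1.1 of the element set; the wrapping `metadata` element, the `dublin_core_record` helper and the sample values are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Return an XML string with one dc:<name> element per entry in fields.

    The <metadata> wrapper is a hypothetical container chosen for this sketch.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values, chosen for illustration.
record = dublin_core_record({
    "title": "Chemical Informatics Letters",
    "creator": "J M Goodman",
    "identifier": "ISSN 1752-0010",
})
print(record)
```

Each metadata field becomes one namespaced element, so any XML-aware tool can pick out, say, the creator without knowing anything else about the document.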

BioXML (November 2000)
BioXML is a resource to gather XML documentation, DTDs and tools for biology in one central location. The goal is to provide the biology community with a set of standard XML tags to facilitate data exchange.

Manchester Bioinformatics (November 2000)
The Bioinformatics Unit at Manchester University comprises four groups, headed by Dr Terri Attwood, Dr Andy Brass, Dr Paul Higgs and Dr Erich Bornberg-Bauer. The projects in the unit include studies of carbohydrate structure in solution, protein-protein interactions, polymeric materials, structure and evolution of RNA, and systems for access to multiple biological databases.

JCAMP (November 2000)
The IUPAC working party on spectroscopic data standards is defining the JCAMP-DX format.

Folding@home (October 2000)
Is your computer wasting its time looking for extra-terrestrial life when it could be doing something useful? Solve the protein folding problem instead. Professor Vijay Pande at Stanford has written a program which allows you to do this.

XML Standards for Chemical Industry (October 2000)
A number of companies, including Dow, BASF and DuPont, are supporting the initiatives of the Chemical Industry Data eXchange (CIDX) to coordinate XML standardisation for chemical e-commerce.

Undergraduate Chemo-Informatics Laboratory (October 2000)
Dr Henry Rzepa runs an undergraduate Chemo-informatics laboratory at Imperial College.

OpenChem (October 2000)
OpenChem is an open source program for chemistry, written in Python and C. At the moment, it has just basic features: molecular construction, visualisation, loading and saving of pdb files, rotation and translation of molecules, structure creation and editing, and energy minimisation. Some more modules are planned.

Relibase (October 2000)
Relibase is a program for searching protein-ligand databases, written by Manfred Hendlich and further developed at the Cambridge Crystallographic Data Centre.

TOXNET (October 2000)
TOXNET (Toxicology data network) is a group of databases for toxicology and hazardous chemicals.

Traditional Chemical Information (October 2000)
The University of Illinois at Urbana-Champaign has put copies of some of its chemistry books on its web site, illustrating how chemical information has been handled over the last five hundred years. Read Priestley, Lavoisier and Dalton in the original.

MSI to buy Oxford Molecular's Software Business (September 2000)
MSI are buying Oxford Molecular. This will strengthen MSI's expertise in bioinformatics and in cheminformatics products. Recently, MSI has also bought Synopsys Scientific Systems.

ChemWeb PrePrint Server (September 2000)
A preprint server for chemistry. Does this fit in with the culture of chemistry? There is a successful physics preprint server, which covers physics and related disciplines: mathematics, nonlinear sciences, computational linguistics, and neuroscience. The US Department of Energy also runs a preprint network for physics, materials, chemistry and some biological subjects.

NFCR Center for Computational Drug Design (September 2000)
Oxford University has announced the creation of a Center for Computational Drug Design. Professor Graham Richards will be the director.

An Introduction to Structural Bioinformatics (September 2000)
This course, which covers sequence analysis, database searching and molecular modelling applied to proteins, will be run at York, Monday 11th - Thursday 14th September 2000.

Synthesis Reviews (August 2000)
A database of 12 268 review articles and books of interest to synthetic organic chemists (1970 - 1999), compiled by Professor Philip Kocienski and Dr Krzysztof Jarowicki of Glasgow University. Synthesis Reviews is not copy-protected and copies can be freely circulated.

Beilstein renamed (August 2000)
Beilstein Informationssysteme was taken over by Elsevier Science in 1998. The name of the company was changed to MDL Information Systems GmbH in July 2000.

DAMP (August 2000)
The Data Management Project (DAMP) is part of the CLRC data management activity. It concentrates on climate researchers and environmental scientists, but also addresses general issues of data storage, data management, data inter-operability, visualisation and emerging computing technologies.

MSc in Chemoinformatics at Sheffield (July 2000)
This course includes Java programming, object-oriented programming and molecular modelling, in addition to a dissertation and lectures on chemical informatics.

CAS statistical summary (August 2000)
A statistical summary of some aspects of Chemical Abstracts (1907-1999) is available.

BBSRC call for proposals in Bioinformatics (July 2000)
Applications are invited for BBSRC studentships, starting October 2001. Deadline: August 4th, 2000.


Derek: toxicology prediction (September 2000)
Using a knowledge base, DEREK predicts toxicology information from molecular structures.

Generic Source-Based Nomenclature for Polymers (August 2000)
Provisional recommendations have been produced by IUPAC for polymer nomenclature.

Scalable Vector Graphics (August 2000)
A new graphics format, promoted by Adobe and developed through the World Wide Web Consortium, which has the potential to replace GIF, JPEG, etc. Pictures are described in a text format, which is both human-readable and XML. A plug-in from Adobe is required to see the pictures in a web browser. The format includes support for animation and also interaction with the viewer. The World Wide Web Consortium has issued a "candidate recommendation", which means that the specification is maturing and is ready for more widespread implementation testing.
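
To illustrate the point that an SVG picture is plain text and also XML, here is a small Python sketch that writes a one-circle drawing as a string and then reads it back with a standard XML parser. The drawing itself and the variable names are made up for illustration; only the `svg` and `circle` elements and the SVG namespace come from the specification.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# A complete SVG picture: a red circle on a 100 x 100 canvas.
svg_text = (
    f'<svg xmlns="{SVG_NS}" width="100" height="100">'
    '<circle cx="50" cy="50" r="40" fill="red"/>'
    '</svg>'
)

# Because the picture is XML, an ordinary XML parser can read it.
root = ET.fromstring(svg_text)
circle = root.find(f"{{{SVG_NS}}}circle")
print(root.tag)
print(circle.attrib["r"])
```

The same text could be saved to a file with an `.svg` extension and opened in any viewer that supports the format.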

Challenges of the Grid (Nature 2000, 406, 331) (July 2000)
Nature has an 'Opinion' on the Grid (see European Grid Forum or Grid Forum) which aims to foster the cooperative use of distributed computing resources. Using XML, CORBA and other protocols it is possible for computers to understand each other's content. How can this opportunity be used?

Science's Neglected Legacy (Nature 2000, 405, 117-120) (July 2000)
An intellectual property attorney, Stephen M Maurer, writes in a commentary that large, sophisticated databases cannot be left to chance and improvisation.


ASIS 2000: Knowledge Innovations (September 2000)
The American Society for Information Science (ASIS) is holding its annual conference, Saturday 11th - Thursday 16th November 2000, in Chicago.

Chemistry and the Internet: ChemInt 2000 (September 2000)
This conference, organised by Henry Rzepa, Stephen Heller, and Wolf-Dietrich Ihlenfeldt, will examine current and future technologies and applications for chemistry and the internet. It will run Saturday 23rd - Tuesday 26th September 2000, at Georgetown University, Washington, DC.

ACS 220th National Meeting: CINF section (August 2000)
The division of chemical information has a full programme at the Washington DC meeting, August 20th - 24th, 2000.

The Cost Effectiveness of Chemical Information (August 2000)
This meeting, organised jointly with the Chemical Structure Association, will take place at Burlington House on 4th October 2000. To make a presentation, contact the chairman of the chemical information group, Doug Veal.

Chemoinformatics: Computational Tools for Lead Discovery (July 2000)
This is the second joint conference organised by the Chemical Structure Association and the Molecular Graphics and Modelling Society. It will be held in Sheffield, 9th-11th April, 2001. The deadline for submitting papers is 15th September 2000.

Web Sites:

Kartoo (June 2002)
A new search engine, which groups the results of searches together as maps. For example, a search on "chemical informatics" (June 2002) gives a map which includes Cambridge Chemistry (with the largest highlighting blob) linked with the Chemical Informatics pages at Indiana University, and various other sites, such as the University of Minnesota Biocatalysis/Biodegradation Database and GenomeNet.

Erlangen/Frederick/Bethesda Collaboration (June 2002)
Also called the Erlangen/Bethesda Data and Online Services, this collaboration provides structures, data, tools, programs and other useful information, and will probably be most useful for researchers in chemical information and computer-aided drug design (CADD).

Acros Organics (June 2002)
Free search of 6000 IR spectra, from Acros Organics.

Solvent database (May 2002)
The National Center for Manufacturing Sciences makes available Solv-DB: a database of solvents.

History of Scientific Information and Communication (May 2002)
Thomas Hapke, Subject Librarian for Chemical Engineering at the Technical University Hamburg-Harburg, has created a resource for the history of scientific information.

Cheminformatics in China (May 2002)
The University of Science and Technology of China has a chemical informatics resource.

UKOLG (April 2002)
The UK Online User Group is organising a meeting: "Chemical Information: Maximising the opportunities and minimising the challenges".

Chemindustry Contest (April 2002)
Best chemical website contest, with closing date June 27, 2002. The competition has four sections: Online Courses, Calculators, Tools; Portals and Information Hubs; Chemistry and Engineering Schools; Corporate Sites. Interesting sites have been submitted in each section.

CCCBDB (April 2002)
The Computational Chemistry Comparison and Benchmark DataBase contains experimental and computational thermochemical data for 615 gas-phase molecules and tools for comparing experimental and computational ideal-gas thermochemical properties.

Mineral Web (March 2002)
A web site for the 3-D display of mineral structures

E-Coli (March 2002)
The EcoCyc database describes the genome and the biochemical machinery of E. coli. MetaCyc, a metabolic encyclopedia, contains metabolic pathways from many organisms, but not genomic data.

MOLPRO (March 2002)
Ab initio quantum chemistry programs designed by Professor Hans-Joachim Werner and Professor Peter Knowles.

MIPS (February 2002)
The Munich Information Centre for Protein Sequences, the institute for bioinformatics of the GSF (National Research Center for Environment and Health). Services provided include:
  • Pedant (Computational analysis of complete genomic sequences)
  • Orpheus (Software system for gene prediction in complete bacterial genomes and large genomic fragments)
  • MITOP (Database for mitochondria-related genes, proteins and diseases)
  • Sequence Database Access
  • EU CORBA (Linking Biological Databases using CORBA)

Eric Weisstein's World of Mathematics (February 2002)
This mathematics encyclopedia, "the web's most extensive mathematics resource", is back on the web, after an absence because of legal discussions about who owned it.

Virtual Kinetic Laboratory (February 2002)
This site was developed by Professor Thanh Truong at the University of Utah. His research group's software aims to bridge the gap between fundamental chemistry and engineering.

Bioinformatics (January 2002)
A conference on bioinformatics run by IQPC (International Quality & Productivity Center) conference organisers.

TBR (January 2002)
The Bioinformatics Resource, from CCP11. This is the updated web site for CCP11.

Property Calculation (January 2002)
An on-line calculator from the University of Georgia. It can calculate "pKa, Properties, Kinetics and Hydrolysis" from SMILES strings.

Pfam (January 2002)
Protein families database of alignments and HMMs from the Sanger Centre.

MSc in Molecular Modelling (January 2002)
Cardiff University has started an MSc course in molecular modelling, from October 2001.

CHI's chemical informatics glossary (December 2001)

Reality Grid (December 2001)
Moving the bottleneck out of the hardware and back into the human mind

6th International Conference on Chemical Structures (December 2001)

BioPerl (November 2001)
The Bioperl Project is an international association of developers of open source Perl tools for bioinformatics, genomics and life science research. Perl has a key role as it saved the human genome project! The bioperl servers reside in Cambridge, Massachusetts USA with facilities donated by Genetics Institute.

(November 2001)
A guide to chemistry, run by About.

CHI - chemical informatics conference (November 2001)
First Announcement and Call for Papers for the Cambridge Healthtech Institute's Sixth Annual Chemoinformatics conference.

ChemSoc Time Line (October 2001)
A time line for chemistry. How has the subject developed since the beginning of the universe?

Developing Bioinformatics Computer Skills (October 2001)
A new book on skills specific for bioinformatics. What skills are needed? Unix, Perl, and SQL figure prominently.

BioCarta (October 2001)
Charting the pathways of life... A developer, supplier and distributor of uniquely sourced and characterized reagents and assays for biopharmaceutical and academic research. The BioCarta web site serves as an interactive web-based resource for life scientists.

MEROPS (September 2001)
The MEROPS database provides a wealth of information on proteases. It is run from the Babraham Institute.

Minnesota Biocatalysis/Biodegradation (September 2001)
Microbial biocatalytic reactions and biodegradation pathways primarily for xenobiotic, chemical compounds.

What is there? (WIT) (September 2001)
Interactive Metabolic Reconstruction on the web.

BRENDA (August 2001)
A collection of enzyme functional data available to the scientific community free of charge for academic, non-profit users.

ENZYME (August 2001)
Enzyme nomenclature site, based on the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB)

The Biotechnology Division: the National Institute of Standards and Technology (August 2001)
This is an index to databases of the thermodynamics of enzyme-catalyzed reactions, short tandem repeat DNA, and biological macromolecule crystallization.

NTIS (July 2001)
The National Technical Information Service is the U.S. Government's central source for the distribution of scientific, technical, engineering, and related business information. This information is produced by or for the U.S. Government, and is supplemented by complementary material from international sources.

Electronic Lab Notebooks (July 2001)
Can a paper lab notebook be replaced by an electronic lab notebook? This company thinks the answer is yes.

Gale Rhodes (July 2001)
Gale Rhodes, Professor of Chemistry at the University of Southern Maine, has written web tutorials on crystallography and macromolecular graphics.

Chemical Informatics Notes (June 2001)
Lecture notes on chemoinformatics by Dr Christoph Steinbeck of the Max-Planck-Institute of Chemical Ecology.

Rosetta Inpharmatics (June 2001)
Rosetta Inpharmatics produces informational genomics solutions. Its mission is to innovate and integrate technologies in computational and molecular biology to catalyze discovery for the life sciences industry. It has just been bought by Merck.

Spectroscopy Now (June 2001)
On-line resource serving spectroscopy, including:
  • Mass Spectrometry (incorporating Base Peak)
  • X-ray Spectrometry
  • NMR (incorporating NMR Knowledge Base)
  • Chemometrics (incorporating Chemometrics World)

Molecular Modelling Laboratory (May 2001)
Molecular Modelling Laboratory, run by Alexander Tropsha

Graphics programming (May 2001)
The Graphics Programming Black Book by Michael Abrash is now available on the web.

Dr Oliver Smart's home page (May 2001)
Dr Smart's primary research interest is in using computational methods to link biomolecular function and structure.

Project Gutenberg (April 2001)
A collection of books available on-line. None of these books are still in copyright. This generally means that the texts are taken from books published pre-1923, and includes the Bible, Shakespeare, the Declaration of Independence, Lewis Carroll, etc.

datagrid (April 2001)
DataGrid is a project funded by the EU. The objective is to enable next generation scientific exploration which requires intensive computation and analysis of shared large-scale databases, from hundreds of TeraBytes to PetaBytes, across widely distributed scientific virtual communities.

Peer-to-peer (April 2001)
The Peer-to-Peer Working Group is a consortium for the advancement of infrastructure standards for peer-to-peer computing. Many companies, including Intel, are interested. See also OpenP2P. There is a company called Peer to peer, but does it have anything to do with peer to peer? Java is joining in with JXTA.

Chemical Information for Organic Chemistry (March 2001)
This site is run from Columbia University, and contains a list of chemical information sources.

Chicago University Chemistry Library (March 2001)
The Chicago University Chemistry Library is an impressive example of how on-line information can be managed. The librarian is Andrea Twiss-Brooks, the webmaster for the ACS division of Chemical Information.

Molecular Knowledge Systems (March 2001)
Software for physical property estimation and molecular design

Impact Factors (February 2001)
Which is the most important journal? An impact factor is the average number of citations per paper per year, and so gives one measure of the importance of a journal.
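
The arithmetic can be made concrete with a short sketch. It assumes the widely used two-year definition (citations received in one year to papers published in the previous two years, divided by the number of those papers), which is more specific than the description above; the function name and the numbers are hypothetical.

```python
def impact_factor(citations, papers):
    """Two-year impact factor (assumed definition for this sketch).

    citations: citations received this year to papers from the previous two years
    papers: number of papers the journal published in those two years
    """
    if papers <= 0:
        raise ValueError("papers must be positive")
    return citations / papers


# Hypothetical journal: 200 papers over two years, cited 500 times this year.
example = impact_factor(500, 200)
print(example)
```

Because the result is an average over all papers, a handful of very highly cited articles can raise it considerably, which is one reason to treat it as a rough guide rather than a ranking.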

Linux for Chemistry (February 2001)
This site, which is linked from the WWW Virtual Library for Chemistry, maintains a list of chemistry programs for Linux.

Amos' WWW links page (February 2001)
This list contains links to information sources for life scientists with an interest in biological macromolecules (protein sequence, 3D structure and 2D-gel analytical tools are provided on the ExPASy server, and from its Proteomics tools page)

Learning Chemistry through Java (January 2001)
Professor Andrew Rappe from the University of Pennsylvania has used Java to create some educational resources for chemistry.

Classic Chemistry (January 2001)
This website, run by Professor Carmen Giunta of Le Moyne College, New York, has a number of classic chemistry papers, a glossary of archaic chemical terms, and a history of chemistry calendar, giving month by month chemical anniversaries.

Marvin Applets (January 2001)
Marvin Applets 2.6, Marvin JavaBeans 2.6 and JChem 1.5 have been released. These Java tools are useful for building internet, intranet, distributed or standalone chemical applications.

Molecular Networks (December 2000)
Molecular Networks provides software for chemical and biochemical sciences. Areas of interest include: rational drug design, combinatorial chemistry, organic reactions and synthesis, process development, structure elucidation and data analysis. Products include the 3D structure generator CORINA, and PETRA (Parameter Estimation for the Treatment of Reactivity Applications) which calculates physicochemical effects in molecules.

Search Engine Watch (December 2000)
How can search engines be used most effectively? This site presents some answers.

Information retrieval in chemistry (December 2000)
A ChemInformatics site, based in the Institute of Physical Chemistry NCSR 'Demokritos', Athens.

National Biotechnology Information Facility (November 2000)
The NBIF is located at New Mexico State University and provides a single point of access to a vast store of widely distributed biotechnology data, as well as developing new educational and bioinformatics services.

BioTech (November 2000)
BioTech is a biology/chemistry educational resource and research tool located in the laboratory of Andrew Ellington at the University of Texas at Austin. It has many links and articles on various biological and chemical topics.

Chemical Genealogy Database (November 2000)
A large database tracing the scientific "ancestry" of chemists back through their PhD advisors. The database shows some University of Illinois bias, but is an interesting general data source.

Professor Curt Breneman (October 2000)
Professor Curt Breneman's research centres on physical organic and computational chemistry. He is also involved in a project for the automated design and discovery of novel pharmaceuticals using semi-supervised learning in large molecular databases.

National Centre for Biotechnology Information (October 2000)
The NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information.

Information Systems for Biotechnology (October 2000)
The ISB provides information for environmentally responsible use of agricultural biotechnology. It is funded by the United States Department of Agriculture and based at Virginia Tech.

Chemical Informatics at Indiana University (September 2000)
The School of Informatics in Bloomington (Indiana University) offers an MS in Chemical Informatics. Core courses include organic chemistry. A wide range of electives are available, including C++ programming and many chemistry courses.

Cheminformatics at GenChemics (August 2000)
GenChemics (GCC) is a contract research company, specialising in accelerating lead discovery and optimisation using cheminformatics, bioinformatics and computer science.

Chemoinformatics Group at the Max Planck Institute of Chemical Ecology (July 2000)
This group works on computer-assisted structure elucidation and is developing the JChemPaint editor.

Human Genome Project Working Draft Sequence (July 2000)
A preliminary assembly of the current draft of the human genome.

MIRAGE WWW server (August 2000)
Molecular Informatics Resource for the Analysis of Gene Expression

Trinity University Cheminformatics Site (August 2000)
This site is run by Professor Steven Bachrach, the Editor-in-Chief of the Internet Journal of Chemistry, whose research interests include computational chemistry and use of the internet in chemistry.

Professor Frank Hollinger (September 2000)
Professor Frank Hollinger runs computational chemistry and chemoinformatics courses at the Stevens Institute of Technology.

Professor Jie Liang (University of Illinois at Chicago) (July 2000)
Professor Liang's research interests include structural bioinformatics, cheminformatics, drug discovery, data mining, and computational biology. He directs the molecular informatics laboratory in the Bioengineering Department.

UK-QSAR and ChemoInformatics Group (September 2000)
This group organises twice yearly meetings. The web site includes a bulletin board.

© 2000-2006 J M Goodman, Cambridge; Chemical Informatics Letters ISSN 1752-0010
Webmaster: J M Goodman