Thursday, September 11, 2008


Wikipedia covers a wide range of genomic, biologic and engineering topics, many in detail and with reasonable expertise.

Friday, August 24, 2007


SciVee provides a place for scientists to upload a short video about a recent article -- the equivalent of a conference presentation. In its current beta version, only articles published in PLoS journals can be SciVee'd, but the next version will accept any article in PubMed Central. The majority of journal publishers put articles in PubMed Central 6-18 months after publication.

SciVee is a joint effort of the Public Library of Science (PLoS), the National Science Foundation (NSF) and the San Diego Supercomputer Center (SDSC).

Monday, August 20, 2007

Bio5/MRB InfoSources Series, Fall 2007

Wednesdays - August 29 - September 26, 2007
Noon in Keating, Room 103
Free lunch provided

Aug 29 What impact are you making (and where can you get funding for it?)
Two brief presentations on:
- Journal Impact Factors via the JCR database, and open access journals.
- Search grant sources such as Community of Science and CRISP.

Sept 5 UA InfoSources - the key to research success
A speedy tour of ebooks, ejournals, databases, interlibrary loan and other services available to Bio5 researchers.

Sept 12 The Business of Science
BioScan and other Business Databases - How to locate information about U.S. and foreign companies actively involved in biotech research and development.

Sept 19 SciFinder Scholar 2007
An overview of the latest interface to Chemical Abstracts Online, providing highly customizable searching.

Sept 26 RefWorks & Web of Science
A Find it > Save it > Cite it workshop. Two brief presentations on:
- Using Web of Science to track the impact of your research and follow a topic both forwards and backwards in time
- Learn how to gather and organize your article and book citations for a variety of projects online using RefWorks.

Thursday, August 2, 2007

NC3Rs - Mouse databases

NC3Rs - Mouse databases is an extremely thorough and up-to-date guide to mouse information -- genomic, proteomic and strains. This NC3Rs guide describes online resources under these headings:
  • Conditional mouse models databases - 2 resources
  • Transgenic and knockout databases - 3 resources
  • Mouse strain repositories - 4 resources
  • Mouse pathology - 2 resources
  • Mouse genetics and genomics - 8 resources
  • Comprehensive mouse links - 4 resources
The sidebar links to more interactive resources, such as working group white papers and listservs.

Monday, July 30, 2007

Medicago truncatula -- MedicCyc

A recent article from Bioinformatics discusses a visual biochemical pathway database -- MedicCyc -- for Medicago truncatula -- "Metabolic pathway reconstruction was used to generate a pathway database for M. truncatula (MedicCyc), which currently features more than 250 pathways with related genes, enzymes and metabolites."
Urbanczyk-Wochniak E, Sumner LW. MedicCyc: a biochemical pathway database for Medicago truncatula. Bioinformatics. 2007 Jun 1;23(11):1418-23. Epub 2007 Mar 7.

Tuesday, July 24, 2007

Illumina BeadChip platform

Illumina Seminar
Keating Room 103 -- Large Meeting room off of Lobby

GATC will be hosting a seminar from Illumina concerning the applications of their "next generation sequencing" platform. The topics being presented are:
  • Whole Genome and Candidate Resequencing
  • SNP Mutation Discovery
  • Digital Gene Expression Profiling
  • ChIP by Sequence Tagging
  • Small RNA/miRNA Identification and Quantitation
All are welcome to attend. The talk should last about an hour, and there will be time for questions afterward. Call or email Ryan Sprissler / 520.621.9184 /

A recent article describes IlluminaGUI -- a graphical user interface implemented for analyzing microarray data from the Illumina BeadChip platform

Article -- IlluminaGUI: Graphical User Interface for analyzing gene expression data generated on the Illumina platform


Monday, July 16, 2007

CRISPRdb database


The CRISPRdb database is a browsable sequence database of "clustered regularly interspaced short palindromic repeats" from Archaea and Bacteria.

Each month, the CRISPRFinder program searches several sources for new CRISPRs, and updates the CRISPRdb database. CRISPRFinder can be downloaded and used to search locally generated data as well.

- About
- Link

Browse by
- View the strains taxonomy browser
- View the strains alphabetical browser
- View the strains in database processing order

Friday, July 6, 2007

Allen Brain Atlas

The Neuroscience Gateway, published in association with Nature, focuses on the genetics of the mammalian brain. Its Allen Brain Atlas can be searched by gene names, markers and symbols, anatomic structure or fine structure annotation. It can be browsed by
  • Structure -- Cerebellum / Cerebral cortex / Hippocampal region / Hippocampal formation / Hypothalamus / Lateral septal complex / Midbrain / Medulla / Olfactory bulb / Pons / Pallidum / Retrohippocampal region / Striatum-like amygdalar nuclei / Striatum / Striatum dorsal region / Striatum ventral region / Thalamus
  • Gene symbol
  • Images -- Coronal or Sagittal
Searches return basic information about the gene, plus links to NCBI and other sources. Running your mouse over the Expression Level/Expression Density bar changes the image to show the anatomic area.

Monday, July 2, 2007

Plant genomic/proteomic sources

PlantGDB -- Plant Genome Database. "an NSF-funded project ... to develop plant species-specific EST and GSS databases, to provide web-accessible tools and inter-species query capabilities, and to provide genome browsing and annotation capabilities."
- About
- Link

MIPS Plants Databases - The Munich Information Center for Protein Sequences set of databases "focuses on the bioinformatics of plant genomes. It developed from the Arabidopsis genome annotation group." Databases include maize, Arabidopsis, rice, tomato, lotus, and Medicago. Notable for manual annotations.
- About
- Link

iMap -- "Maize Mapping Project is to develop a fully integrated genetic and physical map for maize. To display this integrated map, we have developed iMap. iMap has three main components: a relational database (iMapDB), a map graphic browser (iMap Viewer) and a search utility (iMap Search). iMapDB is populated with current genetic and physical map data, describing relationships among genetic loci, molecular markers and bacterial artificial chromosome (BAC) contigs."
- About
- Link

Other databases -- "DataBases: Web sites for Chromosome Researchers." Scroll down to Plants.

Friday, June 29, 2007

Seminar - July 12, 11-1:00

You are invited to a SPECIAL SEMINAR about “Novel Technology in Biosensors”
July 12th, 2007
11:00 am-1:00 pm
Kiewit Auditorium, Arizona Cancer Center

Octet Using Biolayer Interferometry for Affordable,
Rapid, Label-Free, Real Time Kinetics Analysis.

Kathi Williams
West Coast Field
Application Scientist

Lunch will be provided.

Science News Sources

A number of web sites focus on science news.

EurekAlert! is one of the best, produced by AAAS. For Bio5, the agriculture, biology and medicine sections are worth following each day. Its primary source is press releases from science centers for governments, universities and journals.

ScienceDaily is a far more commercial venture, but it includes a breaking news section updated every 15 minutes. Its Health and Plants sections are of most interest to us.

Science news aggregators such as Google News, Yahoo News, CNN and BBC tend to offer stories about science news, instead of straight science links.

Yahoo and Google both have directory listings for science news sources, with subcategories.

Monday, June 25, 2007

Human Protein Atlas - images

The Human Protein Atlas is a database of "high resolution images of immunohistochemically stained tissues and cell lines" for normal and cancerous human proteins. The database can be searched by chromosome-location or by keyword. Note that every character -- including spaces -- is searched. "FTCD" is not the same as "FTCD ". Searchable terms include gene name, antibody ID (either CAB001519 or 1519 works), and descriptor terms such as protein names. Click on the Antibody ID to view annotation data; click on the link-dots to view Ensembl/NCBI/RefSeq/Uniprot info.
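Because the search treats every character as significant, it can help to clean up a search term before pasting it in. The following is a minimal sketch of that idea, not anything provided by the Human Protein Atlas itself. The function names are hypothetical, and the six-digit zero padding after "CAB" is an assumption inferred from the single example above (CAB001519).

```python
import re
from typing import List


def normalize_query(raw: str) -> str:
    """Trim stray spaces so 'FTCD ' and 'FTCD' become the same search term."""
    return raw.strip()


def antibody_id_variants(raw: str) -> List[str]:
    """Return both forms of an antibody ID, e.g. 'CAB001519' and '1519'.

    Assumption: the long form is 'CAB' plus the number zero-padded to six
    digits, as in the one example given in the post.
    """
    text = normalize_query(raw).upper()
    match = re.fullmatch(r"(?:CAB)?0*(\d+)", text)
    if match is None:
        raise ValueError("not an antibody ID: %r" % raw)
    number = int(match.group(1))
    return ["CAB%06d" % number, str(number)]


if __name__ == "__main__":
    print(repr(normalize_query("FTCD ")))
    print(antibody_id_variants(" 1519 "))
```

Either form returned by `antibody_id_variants` should be usable as a search term, per the note above that both CAB001519 and 1519 work.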

Friday, June 22, 2007

MedGadget: Biomedical engineering blog

MedGadget covers both consumer and research biomedical gadgets. Postings cover topics such as nasal drug delivery units, museum openings, nanotube carriers and weekly news reports on the latest biomedical engineering feats.

Thursday, June 21, 2007

Nature's Blogs

The journal Nature supports fifteen blogs, ranging from chemistry to climate to methods to biology.

Seven Stones -- "The Molecular Systems Biology blog on systems and synthetic biology" -- works in conjunction with Nature's Molecular Systems Biology page and the European Molecular Biology Organization (EMBO). Seven Stones covers systems and synthetic biology, including technical, scientific and societal issues. Conference presentations are summarized and discussed, as well as the journal literature.

Nature's blogs provide links to news and work with Connotea as well as Precedings.

Wednesday, June 20, 2007

Nature Precedings launches

"Nature Precedings is a free online service that enables researchers to rapidly share, discuss, and cite their early findings." It is modelled on the Pre-Print services common in astronomy and physics. Like other Web 2 services, it makes use of user-generated tags as well as formal subjects to classify information.

There is a discussion on the use of Blogs in research --

Friday, June 15, 2007

Translational research at UA

UA received a Clinical and Translational Science Award (CTSA) to explore state-wide focus on translational research -- moving data from the lab to the bedside. Two major groups have come out of this at UA & ASU: ACTREC & Clinical Scholars Circle.

Tuesday, June 12, 2007

Sandwalk blog

Sandwalk is another biomedical blog, this one dedicated to biochemistry. Moran's "Monday Molecules" series offers a molecular challenge, with a free lunch as the reward. The Bio5 librarians might want to emulate this challenge for our Bio5 folks.

Thursday, May 31, 2007

CutDB: a proteolytic event database

The Burnham Institute for Medical Research offers several databases, including PMAP CutDB. CutDB is a database of proteases and their proteolytic events - including predicted events - from experiments and the literature. It can be searched by a wide range of fields, listed to the right. Some entries, such as furin, link to extensive lists of events, while others, such as griselysin, have only one entry.
  • Clicking on the name in [Protease_definition] may lead to an entry in the PMAP-Proteases database.
  • Clicking on the [Substrate definition] leads to its NCBI Protein entry.
  • Clicking on [Structure] links may crash your web browser - it seems to be highly MSIE/default settings dependent.
  • Clicking on [Details] leads to an entry with substrate/structure, cut-site, cell line and original citation.

Tuesday, May 29, 2007


KEGG LIGAND is an alternative to the NCBI databases, covering the "molecular building blocks of life in the chemical space." Its interlinking is simpler and more up-front than NCBI's -- if not as complete. The Reaction database is easy to use. Compare diethylene glycol in NCBI and LIGAND.

Susumu Goto, Takaaki Nishioka, and Minoru Kanehisa.
LIGAND: chemical database of enzyme reactions.
Nucleic Acids Res. 2000 January 1; 28(1): 380–382.

Thursday, May 24, 2007

NCBI course - Principles of PubChem

What is PubChem?
A public repository of electronic representations of small molecules and associated bioactivity assay data
- new program -- link chem informatics to bio-informatics
- A component of the NIH Molecular Libraries RoadMap
- Part of the NCBI Entrez search and linking system
- A system of four components: molecular libraries
--PubChem Substance DB
--PubChem Compound DB
--PubChem BioAssay DB
--PubChem Structure Search / tool like blast / vast --> Grants
compound repository (MLSMR)
molecular libraries small molecule repository
molec lib screening center netwk MLSCN
predictive ADMET

The National Center for Biotechnology Information
What does NCBI do?
Accepts submissions of primary data.
Develops tools to analyze these data.
Uses these tools to create derivative databases based on the primary data.
Provides free search, linking, and retrieval of data, mainly through the Entrez system.

entrez - text / seq - blast / protein stru - vast / sm molec struc - pubchem

pubchem Types of Databases
=Primary Databases
Original submissions by experimentalists
Content controlled by the submitter
Examples: GenBank, SNP, GEO, PubChem Substance and BioAssay
=Derivative Databases
Built from primary data
Content controlled by third party (NCBI)
Examples: RefSeq, RefSNP, GDS, PubChem Compound

PubChem Databases
substance = real chemicals / non redundant
bioassay = experimental

PC Substance Record
structure display / subID = sid + [compound id=cid] / link to depositor / chem nomenclature
? (iupac names from ncbi)

Non-uniformity in PC Substance - diff ways to draw a chemical
The Bizarre / non-standard in PC (pubchem) Substance (chamomile tea, grapefruit)

PubChem Compound

Standardize Structures
Verify Chemical Data
Atom description (label, element)
Functional group clean-up
Atom valence verification to prevent nonsense structures
“Normalize” and “Standardize”
Valence-Bond canonicalize (for Tautomer invariance)
Aromaticity detection and self-consistency
Stereochemistry detection
Explicit hydrogen assignment
Structural Representations
2D Coordinate generation
Images created
Structures that fail to standardize…
Have no records in PC Compound
Cannot be searched by structure

Stereoisomers in PC Compound (chiral sugars)

PubChem Compound continued
- Calculate Properties and Links
- Structural Information
Calculate & store “Fingerprints”
Calculate & link to similar structures (90% level)
- Physical Properties
Molecular Formula
Molecular Weight
Number of H-bonds donor/acceptor sites
XLogP value
Lipinski value (bioavailability)
Number of Rotatable bonds
- Links to NCBI Database Records
Structures (MMDB records)
Protein sequences (from Structure links)
Genes (from Protein links)
- Links to MeSH Terms through IUPAC name
("believe it or not, but people read every article and assign mesh to them" ... :-)
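The "similar structures (90% level)" note above can be illustrated with a small sketch. The course notes do not say which similarity measure is used, so this assumes the Tanimoto coefficient, a common choice for comparing chemical fingerprints. Fingerprints are represented here as plain Python sets of "on" bit positions, and the function names and example bit sets are hypothetical.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient: shared bits divided by all bits set in either."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def is_similar(fp_a: set, fp_b: set, threshold: float = 0.90) -> bool:
    """True when two fingerprints meet the similarity threshold (90% by default)."""
    return tanimoto(fp_a, fp_b) >= threshold


if __name__ == "__main__":
    # Made-up fingerprints for illustration only; not real PubChem data.
    query = set(range(20))
    near = set(range(19)) | {99}   # shares 19 of 21 distinct bits
    far = set(range(10))           # shares 10 of 20 distinct bits
    print(round(tanimoto(query, near), 3), is_similar(query, near))
    print(round(tanimoto(query, far), 3), is_similar(query, far))
```

Storing the fingerprint once per compound means the pairwise comparison is cheap, which is presumably why the notes list "calculate & store" before "calculate & link".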

PC Compound Record - all the data, most complete
1 CID / bioactivity / links to substances
2 MeSH Links - use pubchem to do chem medline searches!
3 Calculated Properties
vendors / descriptors
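The calculated properties listed above (molecular weight, H-bond donor/acceptor counts, XLogP) are the inputs to the "Lipinski value". As a rough sketch, assuming the usual statement of Lipinski's rule of five (molecular weight at most 500, logP at most 5, at most 5 H-bond donors, at most 10 H-bond acceptors, with at most one violation tolerated), the check could look like this. The class, the function names and the example numbers are hypothetical and are not taken from a PubChem record.

```python
from dataclasses import dataclass


@dataclass
class CompoundProperties:
    """A hypothetical container for properties read off a Compound record."""
    molecular_weight: float
    xlogp: float
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(p: CompoundProperties) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        p.molecular_weight > 500,
        p.xlogp > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(p: CompoundProperties) -> bool:
    """Assumes the common reading that one violation is still acceptable."""
    return lipinski_violations(p) <= 1


if __name__ == "__main__":
    # Illustrative numbers only.
    small = CompoundProperties(180.2, 1.2, 1, 4)
    bulky = CompoundProperties(812.0, 6.3, 7, 14)
    print(lipinski_violations(small), passes_rule_of_five(small))
    print(lipinski_violations(bulky), passes_rule_of_five(bulky))
```

The number of rotatable bonds, also listed in the notes, is not part of the rule of five and is left out of this sketch.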

Handling Mixtures
SID / CID / links to unique components & their cid's

PC BioAssay Record
Tables - active etc / overlap made non redundant

BioAssay Protocol
methods / procedures - no std's / text explanations & links to web

PubChem integration in Entrez
-What is Entrez?
-System of 31 linked databases
-Text search engine
-Tool for finding biologically linked data
-Data retrieval engine
-Virtual workspace for manipulating large datasets
-Free public access

Entrez review
chemical[synonym] (all)
chemical[completesynonym] (exact)
Atom abbrev[element]
[filter] = structure, rules
[pharmaction] = mesh
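The field tags above can also be assembled programmatically. This is a minimal sketch that only builds the search term and a request URL, without contacting NCBI. It assumes the NCBI E-utilities `esearch` endpoint and the `pccompound` database name; the helper function names are hypothetical, and the `[synonym]` and `[completesynonym]` tags come from the notes above.

```python
from urllib.parse import urlencode

# Assumption: the standard NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def field_term(value: str, field: str) -> str:
    """Attach an Entrez field tag, quoting multi-word values as a phrase."""
    if " " in value:
        return '"%s"[%s]' % (value, field)
    return "%s[%s]" % (value, field)


def esearch_url(term: str, db: str = "pccompound", retmax: int = 20) -> str:
    """Build (but do not send) an esearch request for the given term."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


if __name__ == "__main__":
    loose = field_term("aspirin", "synonym")                      # all synonyms
    exact = field_term("diethylene glycol", "completesynonym")    # exact match
    print(loose)
    print(exact)
    print(esearch_url(exact))
```

The same term strings can be pasted directly into the PubChem search box, so the URL step is only needed for scripted searches.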

pubchem Search page
Entrez Limits page, very diff from medline
Entrez History

Display - Downloading Reports
- property report - one line
- BULK -- PubChem download - long records, goes to a URL for a week

Linking in Entrez
- hard = biol / chem
- soft = computed, algorithm links

PubChem Links
related struct / assays / literature (pmc = free) / other entrez db

Linking in Bulk
Use DISPLAY link for list --> "pubchem bioassays" ...

The PubChem FTP Site

Programming Tools

PubChem Help -- excellent