Working with Research Data
Assistance with Statistics & Informatics at OU
- Data Analytics and Visualization
Science Librarian Claire Curry has created the Data Analytics and Visualization research guide to connect OU library users to local services, data sources, analysis tools, and sources of literature on experimental design, software, and evidence synthesis across many disciplines.
- Data Services @ OU Libraries
Digital Scholarship and Data Services is a unit that supports OU community members with their data needs. Consult with specialists and graduate assistants who are experienced in data management, analysis, and visualization. Data Services: Research Data is the focal point where faculty, researchers, and students can ask questions and receive guidance on working with their data.
- OU Supercomputing Center for Education and Research
The OU Supercomputing Center for Education & Research, a division of OU Information Technology, helps undergraduates, grad students, faculty and staff to learn and use advanced computing in their science and engineering research and education.
Life Sciences Datasets
- EcoCyc E. coli Database
EcoCyc is a scientific database for the bacterium Escherichia coli K-12 MG1655. The EcoCyc project performs literature-based curation of the entire genome, and of transcriptional regulation, transporters, and metabolic pathways.
- Gene Ontology Resource
"The Gene Ontology (GO) knowledgebase is the world’s largest source of information on the functions of genes. This knowledge is both human-readable and machine-readable, and is a foundation for computational analysis of large-scale molecular biology and genetics experiments in biomedical research."
- Human Genome Project
The Human Genome Project (HGP) refers to the international 13-year effort, formally begun in October 1990 and completed in 2003, to discover all the estimated 20,000-25,000 human genes and make them accessible for further biological study.
- iDigBio specimens online
"Making data and images of millions of biological specimens available on the web"
- IUCN Red List of Threatened Species
"Contains over 75,000 assessments of species, subspecies, varieties and subpopulations covering a variety of taxa"
- MetaCyc: Metabolic Pathway Database
MetaCyc is a curated database of experimentally elucidated metabolic pathways from all domains of life. MetaCyc contains 2453 pathways from 2788 different organisms.
- Movebank
Animal movement data from a variety of sources (GPS tracking, geolocators, etc.)
- National Science Foundation's National Ecological Observatory Network (NEON)
"NEON collects environmental data and archival samples that characterize plant, animals, soil, nutrients, freshwater and atmosphere from 81 field sites strategically located in terrestrial and freshwater ecosystems across the U.S."
- AquaDocs
AquaDocs is an open access repository covering the natural marine, estuarine/brackish and freshwater environments. It includes all aspects of the science, technology, management and conservation of these environments, their organisms and resources, and the economic, sociological and legal aspects.
- NBII (Archive Site)
The National Biological Information Infrastructure (NBII) website was shut down by the federal government in 2012. The University of North Texas has archived portions of this resource.
- Qiita Datasets
"Qiita allows users to download public data as well as the user’s own private data. This data can then be used for processing and analysis in external tools."
- TAIR: The Arabidopsis Information Resource
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana.
- Smithsonian NMNH Department of Botany Collections Online
The plant collections of the Smithsonian Institution began with the acquisition of specimens collected by the United States Exploring Expedition (1838-1842). These formed the foundation of a National Herbarium which today numbers over 5 million historical plant records, placing it among the world's largest and most important. Over 4.2 million specimen records (including over 115,000 type specimens with images) are currently available in this online catalog
Environmental Sustainability Datasets
- Chemical Effects in Biological Systems (CEBS)
NIEHS-supported public data sets.
- Chemical Entities of Biological Interest (ChEBI)
Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds. (European Bioinformatics Institute)
- Comparative Toxicogenomics Database (CTD)
CTD illuminates how environmental chemicals affect human health.
- DataONE (Data Observation Network for Earth)
Data Observation Network for Earth (DataONE) is the foundation of new innovative environmental science through a distributed framework and sustainable cyberinfrastructure that meets the needs of science and society for open, persistent, robust, and secure access to well-described and easily discovered Earth observational data.
- Environmental Genome Project
The NIEHS Environmental Genome Project is a multi-disciplinary, collaborative effort focused on examining the relationships between environmental exposures, inter-individual sequence variation in human genes and disease risk in U.S. populations.
- IASSIST Data Sources
International Association for Social Science Information Services & Technology data sources.
- National Institute of Environmental Health Sciences Databases
Links to 12 science-related data sets supported by NIEHS.
- OpenDOAR
Directory of Open Access Repositories.
Bioinformatics
- ELIXIR Bioinformatics Tools and Data Services Registry
Well-curated directory with robust searching and filtering capabilities
- FDA: Bioinformatics Tools
List of tools for working with datasets
- Bioconductor
Bioconductor provides tools (in R) for the analysis and comprehension of high-throughput genomic data.
- Qiita
"Qiita (canonically pronounced cheetah) is an entirely open-source microbial study management platform. It allows users to keep track of multiple studies with multiple ‘omics data. Additionally, Qiita is capable of supporting multiple analytical pipelines through a 3rd-party plugin system, allowing the user to have a single entry point for all their analyses. Qiita’s main site provides database and compute resources to the global community, alleviating the technical burdens, such as familiarity with the command line or access to compute power, that are typically limiting for researchers studying microbial ecology. Qiita’s platform allows for quick reanalysis of the datasets that have been deposited using the latests analytical technologies. This means that Qiita’s internal datasets are living data that is periodically re-annotated according to current best practices."
Experimental Design
Biological Sciences
- Springer Protocols
Provides access to full text reproducible laboratory protocols in the life and biomedical sciences.
- Protocols.io
"Create and discover reproducible experimental and computational methods with video, reagents, detailed parameters, and more."
- IACUC: Alternatives to Animal Models: Literature Search
"The regulations of the AWA require that investigators provide Institutional Animal Care and Use Committees (IACUCs) with documentation demonstrating that alternatives to procedures that may cause more than momentary pain or distress to the animals have been considered and that activities do not unnecessarily duplicate previous experiments. A thorough literature search regarding alternatives using relevant sources helps to meet this Federal mandate."
- Alternatives Search: Demonstrating Compliance
"Searching for alternatives means considering ways to reduce, refine, or replace whenever there is proposed animal use in research, teaching, or testing. This guide focuses on the Animal Welfare Act (at left, August 1966) and US regulatory compliance in research and education. It is, however, an international concern, and most countries have animal welfare laws and regulations that also require a consideration of alternatives."
- ALTBIB - Alternatives to Animal Testing
"Bibliography on Alternatives to the Use of Live Vertebrates in Biomedical Research and Testing. The National Library of Medicine (NLM) developed ALTBIB to provide access to PubMed citations for users seeking information on alternatives to animal testing. Many citations provide access to free full text."
- Practical Computing for Biologists
Call Number: QH 324.2 .H33 2011; ISBN: 9780878933914; Publication Date: 2010
"Although many of the techniques are relevant to molecular bioinformatics, the motivation for the text is much broader, focusing on topics and techniques that are applicable to a range of scientific endeavors."
- The Analysis of Biological Data
Call Number: QH 323.5 .W48 2009 TEXTBOOKS; ISBN: 9780981519401; Publication Date: 2008
The authors "motivate learning with interesting biological and medical examples; they emphasize intuitive understanding; and they focus on real data. The book covers basic topics in introductory statistics, including graphs, confidence intervals, hypothesis testing, comparison of means, regression, and designing experiments." Available on reserves.
- Intuitive Biostatistics
Call Number: R 853 .S7 M68 2018; ISBN: 9780190643560; Publication Date: 2017
"Takes a non-technical, non-quantitative approach to statistics and emphasizes interpretation of statistical results rather than the computational strategies for generating statistical data."
- Experimental Design and Data Analysis for Biologists
Call Number: QH 323.5 .Q85 2002; ISBN: 0521009766; Publication Date: 2002
"For students or researchers in biology who need to design experiments, sampling programs, or analyze resulting data. [...] The chapters include such topics as linear and logistic regression, simple and complex ANOVA models, log-linear models, and multivariate techniques. The main analyses are illustrated with many examples from published papers and an extensive reference list to both the statistical and biological literature is also included."
- Experimental Design for Biologists
Call Number: QH 323.5 .G565 2014; ISBN: 9781621820413; Publication Date: 2014
"This handbook explains how to establish the framework for an experimental project, how to set up all of the components of an experimental system, design experiments within that system, determine and use the correct set of controls, and formulate models to test the veracity and resiliency of the data." A philosophical guide to experimental design, rather than a statistically focused approach.
Biomedical Science
- The Ethical Challenges of Human Research
Call Number: R 853 .C55 M55; ISBN: 0199896208; Publication Date: 2012-10-23
Ecology and Evolutionary Biology
- Analysing Ecological Data
ISBN: 1281044679; Publication Date: 2007
"The first part of the book gives a largely non-mathematical introduction to data exploration, univariate methods (including GAM and mixed modelling techniques), multivariate analysis, time series analysis (e.g. common trends) and spatial statistics. The second part provides 17 case studies, mainly written together with biologists who attended courses given by the first authors. [...] The case studies can be used as a template for your own data analysis; just try to find a case study that matches your own ecological questions and data structure, and use this as starting point for you own analysis."
- A Primer of Ecological Statistics
Call Number: QH 541.15 .S72 G68 2013; ISBN: 9781605350646; Publication Date: 2012
"Explains fundamental material in probability theory, experimental design, and parameter estimation for ecologists and environmental scientists."
- The Ecological Detective
Call Number: Oklahoma Biological Station Stacks QH 541.15 .M3 H54 1997; ISBN: 0691034966; Publication Date: 1997
"The Ecological Detective makes liberal use of computer programming for the generation of hypotheses, exploration of data, and the comparison of different models. The authors' attitude is one of exploration, both statistical and graphical."
- Handbook of Meta-Analysis in Ecology and Evolution
Call Number: QH 541.15 .S72 H36 2013; ISBN: 0691137293; Publication Date: 2013
"The handbook identifies both the advantages of using meta-analysis for research synthesis and the potential pitfalls and limitations of meta-analysis (including when it should not be used). Different approaches to carrying out a meta-analysis are described, and include moment and least-square, maximum likelihood, and Bayesian approaches, all illustrated using worked examples based on real biological datasets."