Based on their extensive experience with teaching R & statistics to applied scientists, the authors provide a beginner's guide to R. To avoid the difficulty of teaching R & statistics at the same time, statistical methods are kept to a minimum.
This book is written for S but can be used for R with minimal modifications. Available as an ebook or physical copy. "A guide to using S environments to perform statistical analyses providing both an introduction to the use of S and a course in modern statistical methods. The emphasis is on presenting practical problems and full analyses of real data sets."
"Multivariate analysis includes methods both for describing and exploring such data and for making formal inferences about them. The aim of all the techniques is, in general sense, to display or extract the signal in the data in the presence of noise and to find out what the data show us in the midst of their apparent chaos."
From the introduction: "R makes it easy to work with and learn from data. It also happens to be a complete programming language, but if you’re reading this guide then that might not be of interest to you. That’s OK — the goal here is not to teach you how to program in R. The goal is to teach you just enough R to be confident to explore your data. In this guide, we use R in the same way we use any other statistics software: To check and visualise data, run statistical analyses, and share our results with others. To do that it’s worth learning the absolute basics of the R language and key recent extensions to it."
"This lesson is an introduction to programming in Python for people with little or no previous programming experience. It uses plotting as its motivating example [...] This lesson references JupyterLab, but can be taught using a regular Python interpreter as well. Please note that this lesson uses Python 3 rather than Python 2."
"This textbook provides an introduction to the free software Python and its use for statistical data analysis. It covers common statistical tests for continuous, discrete and categorical data, as well as linear regression analysis and topics from survival analysis and Bayesian statistics. Working code and data for Python solutions for each test, together with easy-to-follow Python examples, can be reproduced by the reader and reinforce their immediate understanding of the topic."
"This Handbook summarizes, explains, and demonstrates the nature of current models, methods, and techniques particularly designed for the analysis of spatial data. The book is designed to be a desk reference for all researchers just getting into the field of spatial data analysis as well as for seasoned spatial analysts. "
"Although many of the techniques are relevant to molecular bioinformatics, the motivation for the text is much broader, focusing on topics and techniques that are applicable to a range of scientific endeavors."
"This methodology will not find every bug in every program, but it is highly effective for the sort of short programs that beginner programmers are assigned as homework. These techniques then scale up to finding bugs in non-trivial programs."
"The United States Indigenous Data Sovereignty Network (USIDSN) helps ensure that data for and about Indigenous nations and peoples in the US (American Indians, Alaska Natives, and Native Hawaiians) are utilized to advance Indigenous aspirations for collective and individual wellbeing. USIDSN’s primary function is to provide research information and policy advocacy to safeguard the rights and promote the interests of Indigenous nations and peoples in relation to data."
"an open source textbook aimed at introducing undergraduate students to data science. [...] In this book, we define data science as the study and development of reproducible, auditable processes to obtain value (i.e., insight) from data." Uses R's tidyverse packages and Jupyter notebooks.
Call Number: QA76.73.P98 V365 2016 (Youngblood Energy Library)
Publication Date: 2016
"For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all--IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools. Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. "
"Python Data Analytics will help you tackle the world of data acquisition and analysis using the power of the Python language. At the heart of this book lies the coverage of pandas, an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. "
"Qiita (canonically pronounced cheetah) is an entirely open-source microbial study management platform. It allows users to keep track of multiple studies with multiple ‘omics data. Additionally, Qiita is capable of supporting multiple analytical pipelines through a 3rd-party plugin system, allowing the user to have a single entry point for all their analyses. Qiita’s main site provides database and compute resources to the global community, alleviating the technical burdens, such as familiarity with the command line or access to compute power, that are typically limiting for researchers studying microbial ecology. Qiita’s platform allows for quick reanalysis of the datasets that have been deposited using the latests analytical technologies. This means that Qiita’s internal datasets are living data that is periodically re-annotated according to current best practices. "