
Biological data exploration with Python, pandas and seaborn Clean, filter, reshape and visualize complex biological

3 Jun. 2020 | English | B089M41Y1F | PDF + Files | 398 pages | 83 MB


 

In biological research, we're currently in a golden age of data. It's never been easier to assemble large datasets to probe biological questions. But these large datasets come with their own problems. How to clean and validate data? How to combine datasets from multiple sources? And how to look for patterns in large, complex datasets and display your findings?

The solution to these problems comes in the form of Python's scientific software stack. The combination of a friendly, expressive language and high-quality packages makes a fantastic set of tools for data exploration. But the packages themselves can be hard to get to grips with. It's difficult to know where to get started, or which sets of tools will be most useful.

Using Python effectively for data exploration is a superpower that you can learn. With a basic knowledge of Python, pandas (for data manipulation) and seaborn (for data visualization) you'll be able to understand complex datasets quickly and mine them for biological insight. You'll be able to make beautiful, informative charts for posters, papers and presentations, and rapidly update them to reflect new data or test new hypotheses. You'll be able to quickly make sense of datasets from other projects and publications - millions of rows of data will no longer be a scary prospect!

In this book, Dr. Jones draws on years of teaching experience to give you the tools you need to answer your research questions. Starting with the basics, you'll learn how to use Python, pandas, seaborn and matplotlib effectively using biological examples throughout. Rather than overwhelm you with information, the book concentrates on the tools most useful for biological data. Full color illustrations show hundreds of examples covering dozens of different chart types, with complete code samples that you can tweak and use for your own work.

This book will help you get over the most common obstacles when getting started with data exploration in Python. You'll learn about pandas' data model, how to deal with errors in input files, and how to fit large datasets in memory. The chapters on visualization will show you how to make sophisticated charts with minimal code, how to best use color to make clear charts, and how to deal with visualization problems involving large numbers of data points.

Chapters include:

Getting data into pandas: series and dataframes, CSV and Excel files, missing data, renaming columns

Working with series: descriptive statistics, string methods, indexing and broadcasting

Filtering and selecting: boolean masks, selecting in a list, complex conditions, aggregation

Plotting distributions: histograms, scatterplots, custom columns, using size and color

Special scatter plots: using alpha, hexbin plots, regressions, pairwise plots

Conditioning on categories: using color, size and marker, small multiples

Categorical axes: strip/swarm plots, box and violin plots, bar plots and line charts

Styling figures: aspect, labels, styles and contexts, plotting keywords

Working with color: choosing palettes, redundancy, highlighting categories

Working with groups: groupby, types of categories, filtering and transforming

Binning data: creating categories, quantiles, reindexing

Long and wide form: tidying input datasets, making summaries, pivoting data

Matrix charts: summary tables, heatmaps, scales and normalization, clustering

Complex data files: cleaning data, merging and concatenating, reducing memory

FacetGrids: laying out multiple charts, custom charts, multiple heat maps

Unexpected behaviours: bugs and missing groups, fixing odd scales

High performance pandas: vectorization, timing and sampling

Further reading: dates and times, alternative syntax

 
