Recent Posts

“Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language, and is open source and open development.” (https://www.bioconductor.org/)

library("dplyr")
library("ggplot2")

Installation

To install the core packages, type the following in an R command window. This may take around 5 minutes. When the option for updating packages appears, type “a” for “all”.

#leave as eval = FALSE when knitting
if (!
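
The excerpt above is cut off before the installation code. As a minimal sketch only, and not necessarily the code used in the post, the pattern commonly given in Bioconductor's own installation instructions uses the BiocManager package. The `run_install` switch is a hypothetical name added here so that nothing is installed by accident, in the same spirit as leaving the chunk at eval = FALSE when knitting.

```r
# Sketch of a typical Bioconductor core installation (assumed, not taken from the post).
# run_install is a hypothetical guard; set it to TRUE to actually install.
run_install <- FALSE

if (run_install) {
  # Install the BiocManager helper package from CRAN if it is missing
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # With no arguments, this installs or updates the core Bioconductor packages;
  # answer "a" at the prompt to update all packages
  BiocManager::install()
}
```

With `run_install` left at FALSE the block does nothing, which makes it safe to keep in a document that is knitted repeatedly.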

  • Why doesn’t my R look like your R [software]?
  • RStudio says that it cannot find the R binaries.
  • We cannot install software (on iPad, Chromebook, etc.)
  • tidyverse cannot be found
  • [package] is not available for R version …
  • there is no package ‘rlang’
  • there is no package ‘broom’
  • rlang and/or broom still do not work
  • How do we know which is the RMD file and which is the HTML file?

If you are planning to do the R assignments on your own computer (recommended), then here is a quick outline for obtaining the software. There are two separate software programs. Most people find it easier to use RStudio than just R, but you need to install R before installing RStudio (by analogy: you need a cell phone before you can use a cell phone case). If you have R and RStudio from a previous course, you still need to update to the current versions!

Duolingo, the language learning app, places users in groups of 50 and assigns a league to each user to encourage competition. The leagues are Bronze, Silver, Gold, Sapphire, Ruby, Emerald, Amethyst, Pearl, Obsidian, and Diamond (in increasing order). What proportion of Duolingo users are in each league? The rules are:

  • everyone starts in the Bronze League
  • the top 15 percent of each group is promoted to the next league up (measured weekly)
  • the bottom 10 percent of each group is relegated to the next league down

In this post, I will try out some stochastic process calculations to answer that question.
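
One way to approach the question, sketched here under strong simplifying assumptions and not necessarily the model used in the full post, is a ten-state Markov chain. The sketch assumes that each week every user independently moves up with probability 0.15, moves down with probability 0.10, and otherwise stays put. It also assumes that Bronze users cannot move down and Diamond users cannot move up. It ignores skill differences, inactive users, and new sign-ups. The names `p_up`, `p_down`, `P`, and `state` are illustrative.

```r
# League names in increasing order, as listed in the post
leagues <- c("Bronze", "Silver", "Gold", "Sapphire", "Ruby",
             "Emerald", "Amethyst", "Pearl", "Obsidian", "Diamond")
n <- length(leagues)

p_up   <- 0.15  # weekly promotion probability (from the stated rules)
p_down <- 0.10  # weekly relegation probability (from the stated rules)

# Transition matrix: rows are the current league, columns are next week's league
P <- matrix(0, nrow = n, ncol = n, dimnames = list(leagues, leagues))
for (i in seq_len(n)) {
  if (i < n) P[i, i + 1] <- p_up    # promotion (assumed impossible from Diamond)
  if (i > 1) P[i, i - 1] <- p_down  # relegation (assumed impossible from Bronze)
  P[i, i] <- 1 - sum(P[i, ])        # otherwise stay in the same league
}

# Everyone starts in Bronze; iterate week by week until the shares settle
state <- c(1, rep(0, n - 1))
for (week in 1:5000) {
  state <- as.vector(state %*% P)
}
names(state) <- leagues

# Approximate long-run proportion of users in each league under this model
round(state, 4)
```

Under these assumptions the long-run share of each league is 1.5 times the share of the league below it (the ratio 0.15 / 0.10), so this simple model puts the largest share in Diamond. A more realistic model would likely need to account for users who stop playing.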

Here I will plot some of the hikes I have done as elevation (from sea level) versus distance. I was inspired by this Reddit post. Today’s code was great practice with geom_segment, geom_label_repel, and using xlim and ylim to expand the plot.

library(ggrepel)
library(tidyverse)
library(readxl)

df_info <- read_excel("hikes.xlsx", sheet = "info")
df_info %>% print()
## # A tibble: 9 x 6
##   Region trail     distance trailhead elevation peak
##   <chr>  <chr>        <dbl>     <dbl>     <dbl> <dbl>
## 1 Tahoe  Mt Tallac 5.
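
As a rough illustration of how those three pieces can fit together (a sketch, not the post's actual plotting code), the example below draws one line segment per hike from its trailhead to its peak and labels each peak. The data frame `df_demo` and all of its values are made-up placeholders, not the contents of hikes.xlsx. The column names are borrowed from the tibble header above, and the axis limits are arbitrary.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical placeholder values, NOT the data from hikes.xlsx
df_demo <- data.frame(
  trail     = c("Trail A", "Trail B", "Trail C"),
  distance  = c(5, 3, 8),            # hike length (units assumed)
  trailhead = c(6500, 4000, 7200),   # starting elevation (units assumed)
  peak      = c(9700, 5200, 10400)   # highest elevation (units assumed)
)

p <- ggplot(df_demo) +
  # one segment per hike, from (0, trailhead) to (distance, peak)
  geom_segment(aes(x = 0, y = trailhead, xend = distance, yend = peak,
                   color = trail)) +
  # labels placed near each peak, nudged apart so they do not overlap
  geom_label_repel(aes(x = distance, y = peak, label = trail)) +
  # widen the plotting window so the labels have room
  xlim(0, 10) +
  ylim(0, 12000) +
  labs(x = "distance", y = "elevation") +
  theme(legend.position = "none")

p
```

Setting xlim and ylim wider than the data range leaves space around the end points, which gives geom_label_repel room to place labels without clipping.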

Teaching

I am a teaching instructor for the following courses at the University of California at Merced:

  • Math 15: first semester data science for life science students
  • Bio 18: second semester data science for life science students
  • Bio 184 (TBD): Python for DNA analysis

Contact

  • dsollberger@ucmerced.edu
  • Derek Sollberger
    School of Natural Sciences
    5200 North Lake Road
    Merced, CA, 95343
  • email for appointment