Hacker News Books

40,000 HackerNews book recommendations identified using NLP and deep learning


The Elephant in the Brain: Hidden Motives in Everyday Life

Kevin Simler, Robin Hanson, et al.

4.4 on Amazon

36 HN comments

The Shallows: What the Internet Is Doing to Our Brains

Nicholas Carr

4.4 on Amazon

34 HN comments

Behave: The Biology of Humans at Our Best and Worst

Robert M. Sapolsky

4.7 on Amazon

33 HN comments

Spark: The Revolutionary New Science of Exercise and the Brain

John J. Ratey MD and Eric Hagerman

4.7 on Amazon

32 HN comments

The Gene: An Intimate History

Siddhartha Mukherjee, Dennis Boutsikaris, et al.

4.7 on Amazon

29 HN comments

Superforecasting: The Art and Science of Prediction

Philip E. Tetlock and Dan Gardner

4.4 on Amazon

29 HN comments

Elements: A Visual Exploration of Every Known Atom in the Universe

Theodore Gray and Nick Mann

4.8 on Amazon

28 HN comments

“Surely You’re Joking, Mr. Feynman!”: Adventures of a Curious Character

Richard P. Feynman, Ralph Leighton, et al.

4.6 on Amazon

28 HN comments

Let My People Go Surfing: The Education of a Reluctant Businessman--Including 10 More Years of Business Unusual

Yvon Chouinard and Naomi Klein

4.6 on Amazon

27 HN comments

How Not to Be Wrong: The Power of Mathematical Thinking

Jordan Ellenberg

4.4 on Amazon

27 HN comments

R for Data Science: Import, Tidy, Transform, Visualize, and Model Data

Hadley Wickham and Garrett Grolemund

4.7 on Amazon

26 HN comments

The Master and His Emissary: The Divided Brain and the Making of the Western World

Iain McGilchrist

4.6 on Amazon

26 HN comments

Beyond: The Astonishing Story of the First Human to Leave Our Planet and Journey into Space

Stephen Walker

4.7 on Amazon

25 HN comments

When: The Scientific Secrets of Perfect Timing

Daniel H. Pink and Penguin Audio

4.5 on Amazon

25 HN comments

Carrying the Fire: An Astronaut's Journeys

Michael Collins

4.8 on Amazon

24 HN comments


tarsinge on July 11, 2021

I found the book R for Data Science (which is free: http://r4ds.had.co.nz) to be a very good introduction to R with the Tidyverse.

alexhutcheson on Dec 20, 2018

Hadley Wickham's book "R for Data Science"[1] does a good job emphasizing the data cleaning and reshaping steps in the analysis process.

[1] https://r4ds.had.co.nz/

thenipper on Feb 16, 2017

Also check out: R for Data Science: http://r4ds.had.co.nz/

It's got a lot of good intro materials for R. Though having some understanding of another programming language would be pretty helpful.

nthot on Nov 22, 2016

R for Data Science is a good R equivalent by Hadley Wickham. It also acts as a high-level overview of the hadleyverse/tidyverse (ggplot2, tidyr, dplyr, etc.). R4DS is free online [1].

[1] http://r4ds.had.co.nz/

staplung on Apr 8, 2016

Hadley Wickham is the creator of ggplot2. His online book R for Data Science has a chapter on visualization which, besides being a good tutorial on using ggplot2, happens to be the most concise explanation of the theory that informs its design (the grammar of graphics).

gordon_shotwell on Feb 16, 2017

The R for Data Science book really is the best, but I would also recommend https://www.datacamp.com/ for some great video tutorials.

cdcrabtree on Dec 25, 2018

I'd second the R recommendation. It's a particularly exciting time to use R, with the proliferation of package collections like tidyverse (https://www.tidyverse.org/) and the development and release of free texts like R for Data Science (https://r4ds.had.co.nz/).

alexhutcheson on Oct 8, 2019

If you're interested in learning how to do data cleaning and restructuring in R, I highly recommend the "Wrangle" chapters of Hadley Wickham's R for Data Science book, which you can read online here: https://r4ds.had.co.nz/wrangle-intro.html

clumsysmurf on Dec 22, 2016

"R for Data Science" by Garrett Grolemund & Hadley Wickham was recently completed.

http://r4ds.had.co.nz/

The ebook is free online; you can buy it from Amazon and O'Reilly too.

jeroenjanssens on Dec 7, 2017

The book "R for Data Science" by Garrett Grolemund and Hadley Wickham (O'Reilly, 2017) [1] provides a comprehensive introduction to modern R and a set of packages known as the tidyverse. Highly recommended.

[1] http://r4ds.had.co.nz/

sdabdoub on June 14, 2018

Hadley Wickham's R for Data Science[1] is a generally good starting reference.

[1] http://r4ds.had.co.nz/

wodenokoto on Apr 24, 2020

The “R for Data Science” book by Hadley Wickham (creator of the tidyverse, and I believe he is chief data scientist at RStudio) is hands down one of the best introductions to data exploration and analysis.

minimaxir on Apr 23, 2017

The "Advanced R" title is not joking, and will be less useful for people who are not already familiar with the language.

For more common knowledge of R, see Hadley's book R for Data Science. (HN discussion: https://news.ycombinator.com/item?id=12513985)

Dumblydorr on July 11, 2021

Use the tidyverse and use the cheatsheets for dplyr and the book R for Data Science. If you're trying to use primarily base R, you'll be limited and hamstrung. Tidyverse is the modern framework of choice.

sxv on Dec 15, 2019

Felt the same way for years coming from the python world. The R for Data Science book[0] was a game changer in making R enjoyable for me.

[0] https://r4ds.had.co.nz/

bart_spoon on Apr 24, 2020

As the other commenter said, probably the best resource out there is "R for Data Science" by Hadley Wickham, the architect of the tidyverse. It will get you up and running with the most important parts of the tidyverse (dplyr for data manipulation, ggplot2 for data visualization, tidyr for general data cleaning utilities, etc). It's available online for free here: https://r4ds.had.co.nz/

wodenokoto on Nov 16, 2019

Start with R for Data Science (book by Hadley, available online for free).

The tidyverse assumes tidy data. If you are not working with tidy data, it is unlikely to be a big help. Most data can probably be thought of as tidy.

Remember that any and every operation on a data frame returns a data frame, so unlike chaining in Pandas, you never have to worry if a method you want to use belongs to a series or a data frame, or if your method is returning a series or a data frame.

select() selects columns, filter() selects rows. This never changes, unlike [], which means different things depending on whether it is used on a data frame (which you are not guaranteed to be served after calling a method on a data frame in pandas!), on a series, or with the .loc or .iloc methods.

There is no index, instead you just filter on rows.

Pandas comes with a ton of built-in utilities that the tidyverse doesn’t have, mostly because R is already full of functions you can easily apply across columns.

But pandas' date handling functions in particular are really cool.
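The point above, that select() always picks columns, filter() always picks rows, and every operation hands back the same kind of table, can be sketched in plain Python. This is only an illustration of the idea: the `Table` class and its methods are hypothetical stand-ins written for this example, not the dplyr or pandas API. The sample rows reuse ratings and comment counts from the listing at the top of this page.

```python
from typing import Any, Callable

Row = dict[str, Any]


class Table:
    """Hypothetical stand-in for a tidy data frame: one dict per row."""

    def __init__(self, rows: list[Row]) -> None:
        self.rows = rows

    def select(self, *columns: str) -> "Table":
        # Column selection: keeps only the named columns, returns a Table.
        return Table([{c: row[c] for c in columns} for row in self.rows])

    def filter(self, predicate: Callable[[Row], bool]) -> "Table":
        # Row selection: keeps rows where predicate is true, returns a Table.
        return Table([row for row in self.rows if predicate(row)])


books = Table([
    {"title": "R for Data Science", "rating": 4.7, "comments": 26},
    {"title": "The Gene", "rating": 4.7, "comments": 29},
    {"title": "Superforecasting", "rating": 4.4, "comments": 29},
])

# Because every step returns a Table, calls chain without checking types.
result = books.filter(lambda row: row["rating"] > 4.5).select("title", "comments")
print(result.rows)
```

Since both methods return a `Table`, the chain never has to ask whether the previous step produced a table or a single column, which is the contrast with pandas that the comment draws.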

thousandautumns on Apr 26, 2019

I've never looked at Data Science in R, but Hadley Wickham's R for Data Science is great in my opinion. Really applicable, down to earth, and focuses much more on the meat of data science (data manipulation and munging, visualization, relational data, and efficient programming) than the typical "fit a neural network to this idealized toy data set!"

It's also available for free online at https://r4ds.had.co.nz/

minimaxir on Aug 10, 2016

These tutorials are from 2014. While they provide a good overview of R syntax, a lot has been added to the R-verse such as dplyr, which the author primarily used for his Trump Tweets blog post yesterday.

If you are interested in learning R, you may want to read the R for Data Science book (http://r4ds.had.co.nz/) by dplyr (and ggplot2) author Hadley Wickham.

Relatedly, I have my own (slightly more complicated) notebooks using R/dplyr/ggplot2, open-sourced on GitHub, if you want further examples of real-world analysis with publicly available data along the lines of the Trump Tweet analysis:

Processing Stack Overflow Developer data: https://github.com/minimaxir/stack-overflow-survey/blob/mast...

Identifying related Reddit Subreddits: https://github.com/minimaxir/subreddit-related/blob/master/f...

Determining correlation between genders of lead actors of movies on box office revenue: https://github.com/minimaxir/movie-gender/blob/master/movie_...

bokstavkjeks on June 13, 2018

It's also worth noting that R becomes much more pleasurable with the Tidyverse libraries. The pipe alone makes everything more readable.

I'm also coming from more of an office setting where everything is in Excel. I've used R to reorganize and tidy up Excel files a lot. ggplot2 (part of the Tidyverse) is also fantastic for plotting; the grammar of graphics makes it really easy to make nice and slightly complex graphs. Compared to my Matplotlib experiences, it's night and day. I'd expect my experience with programming to be quite different from others', though, mainly because any code I write is basically an intermediary step before the output goes back into Excel.

That said, if anyone's interested in learning R from a beginner's level, I can recommend the book R for Data Science. It's available freely at http://r4ds.had.co.nz/ and the author also wrote ggplot2, RStudio, and several of the other Tidyverse libraries.

EDIT: I'm also currently writing my master's thesis in RMarkdown with the Thesisdown package. It's wonderful: it allows for using LaTeX without really knowing LaTeX, which is great for us in business school.

minimaxir on June 9, 2017

For statistical programming, since we're talking about R, I strongly recommend R for Data Science (http://r4ds.had.co.nz) by Hadley Wickham (who created a large number of the R packages that are very commonly used [tidyverse] and incidentally also now works for RStudio).

A good book on statistical theory is harder to come by, though.

a_bonobo on Dec 8, 2017

Deep Learning with Python by François Chollet (author of Keras) will be complete and fully published on the 20th of December: https://www.manning.com/books/deep-learning-with-python The chapters released so far are very good; the outlook chapters were extensively discussed on HN: https://blog.keras.io/the-limitations-of-deep-learning.html and https://blog.keras.io/the-future-of-deep-learning.html

Wickham's R for Data Science came out in December 2016, I'd like to pretend that counts as 2017: http://r4ds.had.co.nz/
It's a very complete introduction to the tidyverse which makes working in R much more pleasurable.

The second edition of Python for Data Analysis came out in October 2017. That one focuses on pandas, NumPy, and Jupyter notebooks, and is a reasonably good introduction to those libraries.

The second edition of Sebastian Raschka's Python Machine Learning came out in 2017 too. That one focuses more on scikit-learn and TensorFlow; I have only heard good things but haven't read much of it.

minimaxir on Aug 17, 2016

This tutorial is much more basic and has far fewer practical statistical applications than the R tutorial posted last week (https://news.ycombinator.com/item?id=12264360), which itself is out-of-date relative to the R for Data Science book (http://r4ds.had.co.nz/).

I really am curious why anything "R" and "Tutorial" gets massively upvoted to the Top 3 of HN like clockwork nowadays. I might have to restart my R tutorial screencasts since there appears to be a demand. :P

minimaxir on Sep 16, 2016

R for Data Science is the canonical source for learning R and other real-world R tools such as dplyr/tidyr/ggplot2, and one I've recommended on HN submissions about R tutorials which simply go over primitive data types and out-of-date packages. (It's one of the reasons I've postponed making R tutorials myself, since the book would be better/more accurate in all circumstances.)

mtzet on July 14, 2017

As a rookie trying to get into the field myself, I think there are quite a few ways to go about it.

The programming part with R, python, julia etc., seems to get the most attention here. I think the most important part here is to learn how to load datasets into your system of choice and work with them to get some nice plots out. The book "R for data science"[1] seems like a good intro for this with R and tidyverse.

Somewhat more overlooked here are the statistical models. I second the recommendation of "Introduction to Statistical Learning"[2], possibly supplemented with its big brother "Elements of Statistical Learning"[3] if you're more mathematically inclined and want more details. I like their emphasis on starting with simple models and working your way up. I also found their discussion on how to go from data to a mathematical model very lucid.

[1] http://r4ds.had.co.nz/

[2] http://www-bcf.usc.edu/~gareth/ISL/

[3] http://web.stanford.edu/~hastie/ElemStatLearn/

alexhutcheson on Nov 4, 2019

To keep them engaged you need some applied project work with real (although preferably pre-cleaned) data sets and a real software environment. In my college courses I would lose focus in the lectures when equation after equation was presented and explained, but the homework projects with real data forced me to learn and internalize how to use different statistical tools (although it didn't really teach me how they are implemented).

In my college econometrics courses we used Stata for this, but I'd probably recommend R if you have a choice. The book "R for Data Science"[1] is really good for teaching the basics of data manipulation, graphing, and running regressions. However, it's not a statistics book - you'd need to consider it a "supplement" to teach applied skills. You'd also want to skip the chapters that focus on cleaning data, programming, etc.

[1] https://r4ds.had.co.nz/
