R tutorial - Learn How to Subset, Extend & Sort Data Frames in R

  182,041 views

DataCamp

8 years ago

Explore how you can subset, extend and sort your data frames in R.
Join DataCamp today, and start our interactive intro to R programming tutorial for free: www.datacamp.com/courses/free...
The data frame sits somewhere at the intersection of matrices and lists. To subset a data frame you can thus use subsetting syntax from both matrices and lists. On the one hand, you can use the single brackets from matrix subsetting; on the other, you can use the double brackets and dollar sign notation that you use to select list elements.
We'll continue with the data frame that contained some information on five people. Have another look at its definition here.
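The definition shown in the video is not part of this transcript, so here is a minimal sketch of what it could look like. Only Frank (row 3, age 21), Cath, and the age and child columns are confirmed by the text; the other names and values are illustrative assumptions.

```r
# Illustrative definition of the `people` data frame.
# Assumption: every value except Frank's row position and age is made up.
name  <- c("Anne", "Pete", "Frank", "Julia", "Cath")
age   <- c(28, 30, 21, 39, 35)
child <- c(FALSE, TRUE, TRUE, FALSE, TRUE)

# stringsAsFactors = FALSE keeps `name` a character vector on older R versions
people <- data.frame(name, age, child, stringsAsFactors = FALSE)

str(people)
```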
Let's start with selecting single elements from a data frame. To select the age of Frank, who is on row 3 in the data frame, you can use the exact same syntax as for matrix subsetting: single brackets with two indices inside. The row, index 3, comes first, and the column, index 2, comes second:
Indeed, Frank is 21 years old. You can also use the column names to refer to the columns of course:
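The two selections just described can be sketched as follows, assuming the illustrative `people` definition (values other than Frank's row and age are assumptions).

```r
# Assumed example data; only Frank's position and age come from the transcript
people <- data.frame(
  name  = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age   = c(28, 30, 21, 39, 35),
  child = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

by_index <- people[3, 2]      # row 3, column 2
by_name  <- people[3, "age"]  # same element, column referred to by name

by_index  # 21
by_name   # 21
```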
Just as for matrices, you can also choose to omit one of the two indices or names, to end up with an entire row or an entire column. If you want to have all information on Frank, you can use this command:
The result is a data frame with a single observation, because there has to be a way to store the different types.
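A sketch of the row selection, again on the assumed `people` data: leaving the column index empty keeps every column.

```r
# Assumed example data; only Frank's position and age come from the transcript
people <- data.frame(
  name  = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age   = c(28, 30, 21, 39, 35),
  child = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Omit the column index to keep all columns of row 3
frank <- people[3, ]

class(frank)  # "data.frame"
nrow(frank)   # 1
```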
On the other hand, to get the entire age column, you could use this command:
Here, the result is a vector, because columns contain elements of the same type.
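The column selection might look like this on the assumed `people` data: leaving the row index empty returns the whole column as a plain vector.

```r
# Assumed example data; only Frank's position and age come from the transcript
people <- data.frame(
  name  = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age   = c(28, 30, 21, 39, 35),
  child = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Omit the row index to keep all rows of the age column
ages <- people[, "age"]

is.data.frame(ages)  # FALSE
is.numeric(ages)     # TRUE
```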
Subsetting the data frame to end up with a sub data frame that contains multiple observations also works just as you'd expect. Have a look at this command, which selects the age and parenting information on Frank and Cath:
All of these examples show that you can subset data frames exactly as you did with matrices. The only difference occurs when you specify only one index inside the single brackets. In the matrix case, R would go through each column from left to right to find the index you specified. In the data frame case, you simply end up with a new data frame that contains only the column you specified. This command, for example, gives the age column as a data.frame. I repeat: a data.frame, not a vector!
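A sketch of that multi-row, multi-column selection. Frank's row number comes from the transcript; Cath being on row 5 and the column name `child` are assumptions of this example.

```r
# Assumed example data; Cath's row position and the `child` column name are assumptions
people <- data.frame(
  name  = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age   = c(28, 30, 21, 39, 35),
  child = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Rows 3 and 5 (Frank and Cath), columns age and child
frank_cath <- people[c(3, 5), c("age", "child")]

frank_cath
```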
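To make the contrast concrete, here is a small sketch comparing a single index on a matrix with a single index on a data frame. The matrix `m` is a made-up example, and `people` uses the same assumed values as before.

```r
# Hypothetical 2 x 3 matrix, filled column by column
m <- matrix(1:6, nrow = 2)

# A single index walks the matrix column by column: the 4th element is row 2, column 2
m[4]  # 4

# Assumed example data for the data frame case
people <- data.frame(
  name  = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age   = c(28, 30, 21, 39, 35),
  child = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# A single index on a data frame selects a column and keeps the data frame wrapper
age_df <- people[2]
class(age_df)  # "data.frame"
```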
Why so? Let me talk about subsetting data frames with list syntax and it'll all become clear. Remember when I told you that a data frame is actually a list of vectors that all have the same length? This means that you can also use the list syntax to select elements. Say, for example, you typed `people$age`:
The age vector inside the data frame gets returned, so you end up with the age column. Likewise, you can use the double brackets notation with a name ... or with an index.
In all cases, the result is a vector.
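The three list-style selections can be sketched like this on the assumed `people` data; all three return the same vector.

```r
# Assumed example data; only Frank's position and age come from the transcript
people <- data.frame(
  name  = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age   = c(28, 30, 21, 39, 35),
  child = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

via_dollar <- people$age       # dollar sign notation
via_name   <- people[["age"]]  # double brackets with a name
via_index  <- people[[2]]      # double brackets with an index

identical(via_dollar, via_name)   # TRUE
identical(via_dollar, via_index)  # TRUE
```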
You can also use single brackets to subset lists, but this generates a new list, containing only the specified elements. Take this command for example:
The result is still a data frame, which is a list, but this time containing only the "age" element. This explains why, earlier, the single-index command gave a data frame instead of a vector.
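A sketch of single-bracket list subsetting on the assumed `people` data, showing that the result is both a data frame and a list.

```r
# Assumed example data; only Frank's position and age come from the transcript
people <- data.frame(
  name  = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age   = c(28, 30, 21, 39, 35),
  child = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Single brackets return a (smaller) list, here a one-column data frame
age_only <- people["age"]

is.data.frame(age_only)  # TRUE
is.list(age_only)        # TRUE
names(age_only)          # "age"
```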
Again, using single brackets or double brackets to subset data structures can have serious consequences, so always think about what you're dealing with and how you should handle it.
Once you know how to correctly subset data frames, extending those data frames is pretty simple. Sometimes, you'll want to add a column, a new variable, to your data frame. Other times, it's also useful to add new rows, so new observations, to your data frame.
To add a column, which actually comes down to adding a new element to the list, you can use the dollar sign or the double square brackets. Suppose you want to add a column `height`, the information for which is already in a vector `height`. Assigning that vector to `people$height` or to `people[["height"]]` will do the trick.
You can also use the `cbind()` function that you learned to use for building and extending matrices. It works just the same for data frames. For example, you can use it to add a weight column, in kilograms.
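Both ways of adding the column can be sketched as follows. The height values (taken here to be in centimetres) are illustrative assumptions, as is the rest of the `people` data.

```r
# Assumed example data; only Frank's position and age come from the transcript
people <- data.frame(
  name  = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age   = c(28, 30, 21, 39, 35),
  child = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical heights, one per person
height <- c(163, 177, 163, 162, 157)

# Option 1: dollar sign notation
people$height <- height

# Option 2: double square brackets (same effect, overwrites the column just added)
people[["height"]] <- height

names(people)
```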
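A sketch of the `cbind()` approach. The weight values are illustrative assumptions, and the new column takes its name from the `weight` variable.

```r
# Assumed example data, including the hypothetical height column
people <- data.frame(
  name   = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age    = c(28, 30, 21, 39, 35),
  child  = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  height = c(163, 177, 163, 162, 157),
  stringsAsFactors = FALSE
)

# Hypothetical weights in kilograms, one per person
weight <- c(74, 63, 68, 55, 56)

# cbind() appends the vector as a new column named after the variable
people <- cbind(people, weight)

names(people)
```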
If `cbind()` works, then surely `rbind()` will work fine as well. Indeed, you can use `rbind()` to add new rows, that is, new observations, to your data frame. Suppose you want to add the information of another person, Tom, to the data frame. Simply creating a vector with the name, age, height, etc. won't work, because a vector can't contain elements of different types. You'll have to create a new data frame containing only a single observation, and add that to the data frame using `rbind()`. Let's call this mini data frame `tom`.
Now, we can use `rbind()` to bind `people` and `tom` together:
Wait, what? R throws an error: names do not match previous names. This means that the column names in `people` and `tom` do not match. We'll have to improve our definition of `tom` to make the merge successful:
Now, `rbind()` will work as you'd want it to work.
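The failing and the working `rbind()` calls can be sketched together. Tom's values and the height and weight columns are illustrative assumptions; `tryCatch()` is used here only so the script keeps running after the error.

```r
# Assumed example data, including the hypothetical height and weight columns
people <- data.frame(
  name   = c("Anne", "Pete", "Frank", "Julia", "Cath"),
  age    = c(28, 30, 21, 39, 35),
  child  = c(FALSE, TRUE, TRUE, FALSE, TRUE),
  height = c(163, 177, 163, 162, 157),
  weight = c(74, 63, 68, 55, 56),
  stringsAsFactors = FALSE
)

# First attempt: no column names, so R generates names that differ from those in `people`
tom_unnamed <- data.frame("Tom", 37, FALSE, 183, 85, stringsAsFactors = FALSE)

# Capture the error message instead of stopping the script
attempt <- tryCatch(
  rbind(people, tom_unnamed),
  error = function(e) conditionMessage(e)
)
attempt

# Second attempt: column names match those of `people`
tom <- data.frame(
  name = "Tom", age = 37, child = FALSE, height = 183, weight = 85,
  stringsAsFactors = FALSE
)
people <- rbind(people, tom)

nrow(people)  # 6
```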
So adding a column to a data frame is pretty easy, but adding new observations requires some care.

Comments: 37
@MaxYari 10 months ago
Great illustration of one thing that infinitely frustrates me in R: return types are all over the place, and since it's not a strongly typed language the only way to not mess it up is to memorise it all. You are never 100% sure what the outcome of some operations will be unless you have a lot of experience with R. Feels like they were designing it based on the idea of "we have to make it work" rather than "we have to make it usable".
@heidipeterson4723 4 years ago
I highly recommend DataCamp. It is just so much information that it is hard to keep track of.
@posoot 3 years ago
Your tutorials are amazing dude, please keep doing what you're doing. You make learning R incredibly simple
@chsa9875 2 years ago
Very good exercises as well as tutorials. You guys did one hell of a job, thank you for posting these videos.
@MostafaMASLOUHI 3 years ago
Clear and direct. BRAVO !
@aaronmay-zhang3555 6 years ago
Fantastic video- extremely helpful!! :D
@vladimirmisin1876 6 years ago
the sarcasm :D
@pronoyroy6657 5 years ago
Thank you so much 😁😁
@danihlzita2345 3 years ago
you rock mate thanks for help
@DriggerGT3 5 years ago
thank. you.
@Orange-xw4lt 3 years ago
Hi, good video, but if I have a wave, how can I take and separate the values of the crests starting from a certain threshold?
@CharlesKongTenFree 3 years ago
you saved my butt
@atrajitkolay3764 5 years ago
How would I split a column from a dataset into rows in R?
@ielyzab2168 4 years ago
Hello, please help me find out how to create a data frame with different numbers of rows?
@poonambajaj17 5 years ago
If we have 5000 observations in weight and have to add up this new column. Should we type all these like you did in video? If yes it will be tough enough
@cheydinal5401 5 years ago
How can I extract only the elements where "child"=FALSE, for example? What if I wanted to create a plot with two bars: one showing the age of all the individual adults and one of the children. Could I do that?
@JonesDawg 4 years ago
with people[people$child == FALSE, ]
@neginkord5057 5 years ago
how can I extract the top and bottom 1/3 of my data? I have the 33rd and 66th percentile score of my dataset, but need a formula to only look at the top bottom 1/3 to run a correlation
@harshabhogle1020 5 years ago
Hi I want to add every 6th row in data with 55845 observations , what will be the code
@AshleyTyagi 3 years ago
at 5:59 can you do people$age
@AnilKumar-et4vg 4 years ago
how to get bottom 10 values of a column in dataframe?
@jpan4438 3 years ago
Nice tutorial. Could be better if how to subset the rows with age
@kavyanjanavankayala 5 years ago
I want quick sort program in r with explanation
@gosha5198 7 years ago
Thanks for the lessons. Could you please advise how can I create subset of data frame choosing only let's say one of the column ID=10:25 (i.e. from 10 to 25)? Or mixed ones ID = 2,3,15:20? thanks in advance.
@km2052 7 years ago
i can help
@harshpathak6512 6 years ago
Just specify the name of the row, for example dataframe["row10", 10:25]. That will print columns 10 to 25 of that particular row. Or just do it without the row.
@abhishekchakrabarty2930 3 years ago
how to approach that if I am given to find all the observations of age of person greater than 21
@kunjjain3171 6 years ago
Where are the exercises?
@cheydinal5401 5 years ago
How do I delete a row or column?
@bazi4517 5 years ago
Deleting columns and rows is rather simple. First, you need to use square brackets to manipulate and extract certain observations in your dataset. So, for example, to delete a column in your dataset you would call out the name of the dataset, use the brackets and create a vector with a minus sign of the column you are looking to delete. Same with deleting a row, except you use a comma after you create the vector. See below. Example for deleting a column: `object = dataname[-c(5)]`. Example for deleting a row: `object = dataname[-c(5), ]`. Hope that helps.
@darkovulin1114 3 years ago
Jumping guy :)
@heidipeterson4723 4 years ago
This is in the introduction course? How about assigning vectors to dataframes
@mukku1800 6 years ago
why Bounce bro ;)
@BillyTheKid380 4 years ago
lmfao
@miccoligno1 4 years ago
thanks Cristiano Ronaldo ;)
@haiyingwang5987 3 years ago
Is that Flemish accent I'm hearing?
@quanttube 4 years ago
why do you dance when you speak?