Comments
@justalittlecod 1 day ago
Is there a way to turn the row headers into data values in your data table? I pulled data in, and where you have 1, 2, 3, etc., my dataset has dates, but I would like them to become their own column, like num1 or num2. Is that possible? I'm new to R, so anything helps!
@StatisticsGlobe 14 hours ago
Hey, are you looking for this? statisticsglobe.com/convert-row-names-into-column-of-data-frame-in-r
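For readers who want a quick picture of the idea before opening the link, here is a minimal base R sketch. The data frame `df`, its values, and the new column name `date` are made up for this illustration and are not taken from the question or the tutorial.

```r
# Hypothetical example data: dates stored as row names instead of a column
df <- data.frame(num1 = c(10, 20, 30),
                 num2 = c(1.5, 2.5, 3.5),
                 row.names = c("2024-01-01", "2024-01-02", "2024-01-03"))

# Copy the row names into a regular column (the name "date" is an assumption)
df$date <- as.Date(rownames(df))

# Reset the row names to the default 1, 2, 3, ...
rownames(df) <- NULL

# Optional: move the new column to the front
df <- df[, c("date", "num1", "num2")]
df
```

If the row names are not valid dates, dropping the `as.Date()` call keeps them as plain text.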
@anitagodwin8067 2 days ago
This has really helped me. Thank you!
@tomasbeltran04050 5 days ago
Is solve better than inv, or is it just my impression?
@StatisticsGlobe 14 hours ago
Hey, I haven’t done any direct comparisons, but I believe using solve() for this purpose is standard practice in R.
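As a small illustration of the base R route, the sketch below shows the two common ways of calling solve(). The matrix `A` and vector `b` are made-up values; inv() is not part of base R, so it is left out here.

```r
# Small invertible example matrix and right-hand side (made-up values)
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(1, 2)

# solve() with one argument returns the inverse of A
A_inv <- solve(A)

# solve() with two arguments solves A %*% x = b directly,
# without forming the inverse explicitly
x <- solve(A, b)

# Both routes give the same solution (up to floating-point error)
all.equal(as.vector(A_inv %*% b), x)
```

When only the solution of a linear system is needed, the two-argument form is usually preferred because it avoids computing the full inverse.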
@dimitriveldkornet6708 5 days ago
Very well explained. Can I do correspondence analysis using the same package?
@StatisticsGlobe 14 hours ago
Thanks! I recommend using the FactoMineR package for correspondence analysis. It’s robust, widely used, and works seamlessly with factoextra for visualizations.
@usmansheikh-v1l 7 days ago
I am facing this error. Can you please help me resolve it?
Error in inputm[, cached.keys(k, rel)] : subscript out of bounds
Calls: klink2 -> infer -> conn.vector -> conn_vector_C
Execution halted
@StatisticsGlobe 6 days ago
Hey, could you please share your code and the structure of your data?
@usmanjavedsheikh495 6 days ago
@StatisticsGlobe Sure, can I send it to you via email?
@StatisticsGlobe 5 days ago
It would be great if you could post it here, so that others can read/contribute as well. Thanks!
@usmansheikh-v1l 5 days ago
@StatisticsGlobe this is my input.R file

# Prepare dataset as input to Klink-2.
data_dir <- "../../data/klink2/"
source("relations.R")
Rcpp::sourceCpp("utils.cpp")

# input relations taken into consideration
relations <- c("publication", "author", "venue", "area")
# number of relations
rn <- length(relations)
# length of saved cooccurrence values
m <- 100

# 4 input variables to Klink-2:
keywordsdb <- new.env(parent = globalenv(), hash = TRUE)
reldb_df <- list()
reldb_l <- list()
inputm <- matrix(, nrow = m, ncol = 0)

# reads dataset into global raw input variables: keywordsdb, reldb_df, reldb_l
read_dataset <- function(limit = -1, named_list) {
  file_name <- paste(data_dir, named_list, "/", named_list, ".tsv", sep = "")
  d <- read.csv(
    file_name,
    sep = "\t",
    header = TRUE,
    stringsAsFactors = FALSE,
    nrows = limit * 2
  )
  empty <- function(elem) is.null(elem) || elem == "" || all(trimws(elem) == "")
  # need only items with defined keywords fields (DE, ID)
  w <- c()
  for (i in 1:dim(d)[1]) {
    if (!(empty(d$DE[i]) && empty(d$ID[i]))) w <- c(i, w)
  }
  d <- d[w, ]
  if (limit > 0) d <- d[1:limit, ]
  # fields used: document title, authors, publication name, research areas, year
  d <- d[c("DE", "TI", "AU", "SO", "SC", "PY")]
  k <- 1
  a <- 1
  n <- nrow(d)
  process_item <- function(item) {
    cat("process article ", a, " out of ", n, " ")
    a <<- a + 1
    # Process keywords with error handling
    keywords <- tryCatch({
      unique(unlist(
        sapply(strsplit(tolower(item["DE"]), ";"), function(x) {
          x <- trimws(x)
          x[nzchar(x)] # Only keep non-zero length strings
        })
      ))
    }, error = function(e) character(0))
    # Skip if no valid keywords
    if (length(keywords) == 0 || all(keywords == "")) return()
    newkeywords <- setdiff(keywords, ls(keywordsdb))
    oldkeywords <- setdiff(keywords, newkeywords)
    # Process other fields with error handling
    authors <- tryCatch({
      res <- unlist(sapply(strsplit(item["AU"], ";"), function(x) trimws(x[nzchar(x)])))
      if (is.null(res)) character(0) else as.vector(res)
    }, error = function(e) character(0))
    areas <- tryCatch({
      res <- unique(unlist(sapply(strsplit(tolower(item["SC"]), ";"), function(x) trimws(x[nzchar(x)]))))
      if (is.null(res)) character(0) else res
    }, error = function(e) character(0))
    venues <- if (!empty(item["SO"])) tolower(trimws(item["SO"])) else character(0)
    # Create relation vectors only for non-empty elements
    relation <- c(1, rep(2, length(authors)), 3, rep(4, length(areas)))
    entity <- c(item["TI"], authors, venues, areas)
    quantity <- rep(NA_integer_, length(entity))
    year <- as.numeric(rep(item["PY"], length(entity)))
    # Process new keywords
    for (i in seq_along(newkeywords)) {
      if (nzchar(newkeywords[i])) { # Only process non-empty keywords
        keywordsdb[[newkeywords[i]]] <<- k
        reldb_l[[k]] <<- list()
        reldb_l[[k]]$publication <<- paste(item["TI"], year[1], sep = "_")
        reldb_l[[k]]$author <<- if(length(authors) > 0) vapply(authors, paste, "", year[1], sep = "_") else character(0)
        reldb_l[[k]]$venue <<- if(length(venues) > 0) paste(venues, year[1], sep = "_") else character(0)
        reldb_l[[k]]$area <<- if(length(areas) > 0) vapply(areas, paste, "", year[1], sep = "_") else character(0)
        names(reldb_l)[k] <<- newkeywords[i]
        reldb_df[[k]] <<- data.frame(relation, entity, quantity, year, stringsAsFactors = FALSE)
        names(reldb_df)[k] <<- newkeywords[i]
        k <<- k + 1
      }
    }
    # Process existing keywords
    for (i in seq_along(oldkeywords)) {
      if (nzchar(oldkeywords[i])) {
        index <- keywordsdb[[oldkeywords[i]]]
        reldb_l[[index]]$publication <<- c(reldb_l[[index]]$publication, paste(item["TI"], year[1], sep = "_"))
        reldb_l[[index]]$author <<- unique(c(reldb_l[[index]]$author, if(length(authors) > 0) vapply(authors, paste, "", year[1], sep = "_") else character(0)))
        reldb_l[[index]]$venue <<- unique(c(reldb_l[[index]]$venue, if(length(venues) > 0) paste(venues, year[1], sep = "_") else character(0)))
        reldb_l[[index]]$area <<- unique(c(reldb_l[[index]]$area, if(length(areas) > 0) vapply(areas, paste, "", year[1], sep = "_") else character(0)))
        reldb_df[[index]] <<- rbind(reldb_df[[index]], data.frame(relation, entity, quantity, year, stringsAsFactors = FALSE))
      }
    }
  }
  apply(d, 1, process_item)
  # Sort the fields in reldb_l
  for (i in 1:length(reldb_l)) {
    if (!is.null(reldb_l[[i]]$publication)) reldb_l[[i]]$publication <<- sort(reldb_l[[i]]$publication)
    if (!is.null(reldb_l[[i]]$author)) reldb_l[[i]]$author <<- sort(reldb_l[[i]]$author)
    if (!is.null(reldb_l[[i]]$venue)) reldb_l[[i]]$venue <<- sort(reldb_l[[i]]$venue)
    if (!is.null(reldb_l[[i]]$area)) reldb_l[[i]]$area <<- sort(reldb_l[[i]]$area)
  }
}

# Rest of the functions remain unchanged
entities_range <- function(reldb_l) {
  a <- Inf
  b <- 0
  for (i in 1:length(names(reldb_l))) {
    for (r in 1:rn) {
      t <- length(reldb_l[[i]][[r]])
      if (t < a) a <- t
      if (t > b) b <- t
    }
  }
  c(a, b)
}

cache_cooccurrence <- function() {
  n <- length(ls(keywordsdb))
  inputm <- matrix(0, nrow = m, ncol = n * 2 * rn)
  maxsize <- entities_range(reldb_l)[2]
  for (i in 1:n) {
    cat("process keyword ", i, " out of ", n, " ")
    irel <- reldb_l[[i]]
    for (r in 1:rn) {
      co_m <- calc_cooccurrence_C(n, m, i, r, reldb_l, maxsize)
      inputm[, cached.keys(i, r)] <- co_m[, 1]
      inputm[, cached.values(i, r)] <- co_m[, 2]
    }
  }
  inputm
}

run_all <- function(limit = -1, named_list) {
  keywordsdb <<- new.env(parent = globalenv(), hash = TRUE)
  reldb_df <<- list()
  reldb_l <<- list()
  read_dataset(limit, named_list)
  inputm <<- cache_cooccurrence()
  if (limit > 0) {
    fname <- paste(data_dir, named_list, "/", named_list, limit, ".Rdata", sep = "")
  } else {
    fname <- paste(data_dir, named_list, "/", named_list, ".Rdata", sep = "")
  }
  save("reldb_df", "reldb_l", "keywordsdb", "inputm", file = fname)
  cat("Input variables saved to", fname, " ")
}

inspect_dataset <- function(filename) {
  file_name <- paste(data_dir, filename, "/", filename, ".Rdata", sep = "")
  load(file_name)
  n <- length(reldb_df)
  entrange <- entities_range(reldb_l)
  cat("Number of keywords: ", n, " ")
  #cat("Range of entities associated with a keyword: [", entrange[1], ",", entrange[2], "] ")
  for (i in 1:rn) {
    cat(
      "Value of co-occurrence,", relations[i], "relation: [",
      min(inputm[, cached.values(1:n, i)]), ",",
      max(inputm[, cached.values(1:n, i)]), "] "
    )
  }
}
@usmansheikh-v1l 5 days ago
I got the above error at this step: "Run the Klink-2 Scripts"
@homaanrandm9441 9 days ago
Hi, I have a dataset on which I want to run a descriptive analysis or a logistic regression. Can you help? If yes, then we can talk.
@StatisticsGlobe 8 days ago
Hey, I run a Facebook discussion group where people can ask questions about R programming and statistics. Could you post your question there? This way, others can contribute/read as well: facebook.com/groups/statisticsglobe
For more detailed, personalized assistance, you may also take a look at our consulting page: statisticsglobe.com/consulting
@123peterjackson 11 days ago
Question: how would I add multiple labels to an x-axis? I have a 13-week trial: the 1st week is a run-in phase, the next 6 weeks are a supplement phase, and the final 6 weeks are a washout phase. Is that even possible?
@StatisticsGlobe 9 days ago
Hey, how about this?

library(ggplot2)

# Example data
data <- data.frame(
  week = 1:13,
  value = rnorm(13)
)

# Custom labels with line breaks
ggplot(data, aes(x = week, y = value)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(
    breaks = 1:13,
    labels = c("1\nRun-in", "2", "3", "4", "5", "6", "7",
               "8\nSupplement", "9", "10", "11", "12", "13")
  )

Note: the phases in the question start in weeks 1, 2, and 8, so the phase labels can be moved to whichever weeks should carry them, for example "2\nSupplement" and "8\nWashout".
@oscarferrerlozano1155 13 days ago
One question: how can I change the dates of the seasons? It treats September as autumn, but some of those dates are still summer in my country.
@StatisticsGlobe 9 days ago
Hey, you will need to define a custom function for this. The time2season function uses default boundaries for seasons, but it doesn’t allow you to directly modify those defaults. Instead, you can create your own logic for assigning seasons based on your country’s calendar. Here’s how you can do it:

# Define example dates
my_dates <- as.Date(c("2022-10-01", "2021-05-13", "2025-12-01",
                      "2023-02-17", "2023-06-25", "2022-10-15"))
my_dates # Print example dates

# Custom function to assign seasons
assign_season <- function(dates) {
  # Extract month and day
  month <- as.numeric(format(dates, "%m"))
  day <- as.numeric(format(dates, "%d"))
  # Assign seasons based on your custom boundaries
  ifelse((month == 12 & day >= 21) | month %in% c(1, 2) | (month == 3 & day < 21), "Winter",
    ifelse((month == 3 & day >= 21) | month %in% c(4, 5) | (month == 6 & day < 21), "Spring",
      ifelse((month == 6 & day >= 21) | month %in% c(7, 8) | (month == 9 & day < 21), "Summer",
        "Autumn")))
}

# Apply the custom function
my_seasons <- assign_season(my_dates)
my_seasons # Print custom seasons
@Getalew 13 days ago
Wonderful! Just out of curiosity: should we test the assumptions of ANOVA before we do the analysis, or after, as you did here in your video? Thank you
@StatisticsGlobe 9 days ago
Thanks! Both approaches are possible. You can test the assumptions either before or after performing ANOVA, as long as you ensure that the assumption tests are completed and any violations addressed before interpreting or sharing your results.
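One practical reason the checks often come after fitting is that a residual-based normality check needs the fitted model. The sketch below is only an illustration and not the code from the video; it assumes the built-in `PlantGrowth` data and uses the Shapiro-Wilk and Bartlett tests as example checks.

```r
# Built-in example data: plant weights under three treatment groups
data(PlantGrowth)

# Fit the one-way ANOVA first; the residuals are needed for the normality check
fit <- aov(weight ~ group, data = PlantGrowth)

# Normality of residuals (Shapiro-Wilk test)
normality <- shapiro.test(residuals(fit))

# Homogeneity of variances across groups (Bartlett test)
homogeneity <- bartlett.test(weight ~ group, data = PlantGrowth)

# Inspect the assumption checks before interpreting the ANOVA table
normality$p.value
homogeneity$p.value
summary(fit)
```

Other checks (for example residual plots) would fit the same pattern: fit the model, examine the assumptions, and only then interpret the ANOVA table.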
@Getalew 13 days ago
I really enjoyed your lecture, and it was informative. Please also include comments to explain some of the code. Thank you
@StatisticsGlobe 9 days ago
Thank you for the kind comment and your feedback!
@poo9poo9ca9choo 15 days ago
Good job. This was quite helpful.
@micha.statisticsglobe 14 days ago
Thanks a lot for your kind feedback. Glad it helped! 🙂
@esenemrullah 15 days ago
Thank you for a quick tutorial, that was quite easy compared to ggplot2. Subscribed!
@micha.statisticsglobe 15 days ago
Thank you very much for your kind feedback! 🙂
@tomspoors768 20 days ago
Of course! One uses Sum to count. How silly of me! Thanks, Joachim
@StatisticsGlobe 19 days ago
You are welcome, glad it was helpful!
@trilisser 21 days ago
Bro this looks like vaginas
@warcoder 22 days ago
Hi! How can I do this with multiple curves?
@StatisticsGlobe 21 days ago
Hey, one solution could be to call geom_area multiple times.
@sodaerynzyrillg.7295 23 days ago
You are saving my group research paper right now, thank you so much <3
@micha.statisticsglobe 22 days ago
You're most welcome. Glad it helped! 🙂
@KarolKarasiewicz 24 days ago
Patchwork and ggstats are the most useful in my opinion, but gganimate looks awesome.
@StatisticsGlobe 23 days ago
Thanks for sharing your insights!
@noureddineabid8167 25 days ago
I like your videos; they are short and summarise the relevant information.
@micha.statisticsglobe 23 days ago
Thank you very much for your kind feedback! 🙂
@WahranRai 25 days ago
Good work. Keep making the code readable and understandable without using the pipe.
@StatisticsGlobe 25 days ago
Thanks a lot for the kind feedback! :)
@amevordoephelixkelvin3667 25 days ago
@StatisticsGlobe 25 days ago
Thanks, glad you like it!
@danielkwawuvi_tutorials 1 month ago
Thank you for the nice demonstration
@micha.statisticsglobe 1 month ago
You're most welcome! 🙂
@HuynhCamThaoTrang 1 month ago
Thank you very much for your great explanation. It is extremely helpful for me!!!
@micha.statisticsglobe 1 month ago
Thanks a lot for your kind feedback. Glad it helped! 🙂
@lupen2024-il2vc 1 month ago
Great, but what if we have a data frame with many variables that contain outliers? Should we take a "one at a time" approach to get rid of the outliers?
@StatisticsGlobe 1 month ago
Hey, for data frames with many variables containing outliers, it’s best to address outliers carefully, often one variable at a time. Rather than removing them outright, consider retaining all data and applying appropriate statistical methods to handle outliers, as they may hold valuable insights or represent unique cases.
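As a rough sketch of the "one variable at a time" idea, the code below flags (rather than removes) potential outliers in each numeric column. The data frame, the helper name `is_outlier`, and the 1.5 * IQR rule are assumptions chosen for illustration; other rules may suit a given dataset better.

```r
# Made-up example data with one obvious outlier per numeric column
df <- data.frame(x1 = c(1, 2, 3, 4, 100),
                 x2 = c(10, 11, 12, 13, -50),
                 g  = c("a", "b", "c", "d", "e"))

# Flag values outside 1.5 * IQR from the quartiles (a common rule of thumb)
is_outlier <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Apply the rule to each numeric column separately
num_cols <- sapply(df, is.numeric)
flags <- sapply(df[num_cols], is_outlier)

# Number of flagged values per variable; the data themselves are left unchanged
colSums(flags)
```

Keeping the flags in a separate object leaves the original data intact, so each flagged value can be reviewed before deciding how to handle it.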
@Gabriel-bw2in 1 month ago
So weird to see Python in RStudio 😅
@StatisticsGlobe 1 month ago
Haha, indeed that's something you have to get used to.
@eliyas8915 1 month ago
The solutions above return an empty data frame for me:
data %>% dplyr::filter(x1 %in% 3:5)
[1] x1 x2 x3 y z
<0 rows> (or 0-length row.names)
This one works for me instead:
data[3:5,]
@StatisticsGlobe 1 month ago
That's surprising. I just ran the code again and for me it works fine.
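One possible explanation (an assumption, since the commenter's data are not shown) is that the two commands do different things: the filter keeps rows whose x1 value is 3, 4, or 5, while `data[3:5, ]` keeps the third to fifth rows regardless of their values. The sketch below uses made-up data and the base R equivalent of the filter to show the difference.

```r
# Made-up data in which x1 holds no values between 3 and 5
data <- data.frame(x1 = c(10, 20, 30, 40, 50, 60),
                   x2 = letters[1:6])

# Filtering by VALUE: keeps rows where x1 equals 3, 4, or 5 (none here).
# This is the base R equivalent of dplyr::filter(x1 %in% 3:5).
by_value <- data[data$x1 %in% 3:5, ]
nrow(by_value)

# Subsetting by POSITION: keeps the third to fifth rows regardless of values
by_position <- data[3:5, ]
by_position$x1
```

If x1 in the commenter's data does not contain the values 3 to 5, an empty result from the filter would be expected rather than an error in the code.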
@eliyas8915 1 month ago
In my opinion, short videos are totally fine, but this type of problem needs a detailed explanation.
@StatisticsGlobe 1 month ago
Thanks for the feedback. Indeed, this is a complex topic. Let me know if you have any specific questions.
@CarlosErnestoAlvarengaSantos 1 month ago
I liked it. Thanks!
@micha.statisticsglobe 1 month ago
Thank you very much for your kind feedback! 🙂
@eliyas8915 1 month ago
Your videos are very helpful for me. After your answers, I always try to get the same result with different methods. For example, I have modified the loop a little bit; it might be helpful to others:

for(i in 1:3) {
  data[[paste0("new", i)]] <- rep(i, nrow(data)) # Add new column directly by name
}

for (i in 1:2) { # Loop to append rows
  new_row <- rep(i, ncol(data)) # Create a new row with values of `i`
  data <- rbind(data, new_row) # Append the new row to `data`
}
@StatisticsGlobe 1 month ago
Thank you so much for the kind comment and for sharing your code!
@SofiA-nf7os 1 month ago
Thank you so much for this video! You helped me with my thesis.
@micha.statisticsglobe 1 month ago
You're most welcome. Glad it helped! 🙂
@peterwestermann5265 1 month ago
Super clear, thank you very much!!
@micha.statisticsglobe 1 month ago
You're most welcome! 🙂
@samthenextgeneration 1 month ago
Thank you for the video! This video was super helpful in explaining the concept.
@micha.statisticsglobe 1 month ago
Thanks a lot for your kind feedback. Glad it helped! 🙂
@khinsoratana248 1 month ago
I've joined your group on Facebook, and now I have subscribed to your KZbin channel. For me, no matter how much you want to change your channel, I still enjoy your practical content.
@StatisticsGlobe 1 month ago
Thank you so much for the very kind feedback, this is great to hear!
@AriseDoryneKalembe-ye9fh 1 month ago
Thank you so much, Statistics Globe
@micha.statisticsglobe 1 month ago
You're most welcome! 🙂
@eliyas8915 1 month ago
Very helpful! Thank you very much! If anyone doesn't want to use qpcR:::cbind.na(vec1, vec2), you can simply transpose the data_rbind as t(bind_rows(vec1,vec2))
@StatisticsGlobe 1 month ago
Thank you for the kind comment and hint!
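For readers who prefer to avoid extra packages altogether, a base R variant of the same idea is sketched below. This is not the method from the video or the comment: it pads the shorter vector with NA before calling cbind(), and the vectors `vec1` and `vec2` are made-up values.

```r
# Two made-up vectors of different lengths
vec1 <- 1:5
vec2 <- c(10, 20)

# Pad the shorter vector with NA by extending its length
n <- max(length(vec1), length(vec2))
length(vec1) <- n
length(vec2) <- n

# cbind() now combines them without recycling the shorter vector
data_cbind <- cbind(vec1, vec2)
data_cbind
```

Without the padding step, cbind() would recycle the shorter vector instead of filling the gaps with NA.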
@KarolKarasiewicz 1 month ago
Hi! Your videos are cool, thank you. Could you do a longer/advanced tutorial on regular expressions? It'd be great.
@StatisticsGlobe 1 month ago
Hey, thank you so much for the kind comment and topic suggestion. I'll consider it for a future video.
@IsobelFrench 1 month ago
How do I add a grid to a nice_violin plot? Thank you!
@StatisticsGlobe 1 month ago
Hey, could you share your code?
@rashawnhoward564 1 month ago
Why import a package for Python and not R? system.time(dqrng::dqrnorm(100000000)) does this in about 1.3 secs
@CarlosErnestoAlvarengaSantos 1 month ago
Many thanks.
@micha.statisticsglobe 1 month ago
You're most welcome! 🙂
@anindabhowmik5688 1 month ago
How can I add the dataset of 24 as the first row? Please help.
@StatisticsGlobe 1 month ago
Hey, you may use the following code: rbind(new_data, old_data)
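A small self-contained sketch of this approach is shown below. The contents of `old_data` and `new_data` are made up for illustration; the only requirement is that both have the same column names.

```r
# Made-up existing data and a new row with the same column names
old_data <- data.frame(x1 = c(1, 2, 3), x2 = c("a", "b", "c"))
new_data <- data.frame(x1 = 24, x2 = "z")

# Put the new row first by listing it first in rbind()
combined <- rbind(new_data, old_data)

# Reset row names so they run 1, 2, 3, ... again
rownames(combined) <- NULL
combined
```

The order of the arguments decides the order of the rows, so swapping them would append the new row at the bottom instead.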
@leveluptennis5440 1 month ago
Excellent overview!
@micha.statisticsglobe 1 month ago
Thank you very much for your kind feedback! 🙂
@eliyas8915 1 month ago
It was helpful, thanks, but we don't need to specify the +-m inside the pipe. You can do the operation directly:
as.Date("2017-05-11") + months(1)
[1] "2017-06-11"
This one is very straightforward.
@StatisticsGlobe 1 month ago
Thank you for the kind comment and the tip! It appears this method works as well. I'm not sure why I did it differently in the video, but perhaps this functionality was added more recently. The video is already over three years old.
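The `months(1)` call in the tip above presumably relies on the lubridate package being loaded (an assumption about the commenter's setup). For comparison, a base R sketch that needs no extra package is shown below; the variable names are made up for illustration.

```r
my_date <- as.Date("2017-05-11")

# seq() with by = "1 month" steps forward by calendar months;
# the second element is the date one month later
one_month_later <- seq(my_date, by = "1 month", length.out = 2)[2]
one_month_later

# A negative step goes back in time
one_month_earlier <- seq(my_date, by = "-1 month", length.out = 2)[2]
one_month_earlier
```

Dates near the end of a month can behave differently across these approaches, so it is worth checking such edge cases with your own data.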
@zeruyimer3764 1 month ago
Many thanks, Statistics Globe, for sharing the video with us. Hopefully you will come up with another one.
@micha.statisticsglobe 1 month ago
You're most welcome! 🙂
@darakhshannehal1828 1 month ago
Glimpse() 🙌🏼. Nice quiz series, keep it up!
@StatisticsGlobe 1 month ago
Thanks, glad you like it! :)
@nigussiekefelegn6764 1 month ago
I think it is better to rephrase my question as follows: How can we do an association analysis between a discrete (nominal or categorical) variable and a continuous (quantitative) variable using R? Could you make a video on it? My second question: can we do an association analysis between two categorical variables using R? Which model can be used?
@gopaltiwariful 1 month ago
When I ran this code with my data, R showed: Error: unexpected invalid token in "my_pca". Any suggestions?
@StatisticsGlobe 1 month ago
Hey, did you run the code exactly as demonstrated in the video?
@gopaltiwariful 1 month ago
@StatisticsGlobe Yes, exactly as it is.
@StatisticsGlobe 1 month ago
That's weird, to be honest, I don't know why this is happening. On my side, everything works fine.
@gopaltiwariful 1 month ago
Error: unexpected invalid token in "my_pca
@StatisticsGlobe 1 month ago
Please see my response to your other question.
@manikandanramaraj3278 1 month ago
Great, Joachim....!!!
@elsiddigsalaheldin9364 1 month ago
Thank you!!
@micha.statisticsglobe 1 month ago
You're very welcome! 🙂