Find index of first and last occurrence in data table

Question:

I have a data table that looks like

|userId|36|37|38|39|40|
|1|1|0|3|0|0|
|2|3|0|0|0|1|

Each numbered column (36-40) represents a week number. For each user, I want to calculate the number of weeks between the first and the last occurrence of a non-zero value.

For instance, for userId 1 in my dataset, the first non-zero value appears at week 36 and the last one at week 38, so the value I want is 38 - 36 = 2. For userId 2 it’s 40 - 36 = 4.

I would like to store the data like:

|userId|lifespan|
|1|2|
|2|4|

I’m struggling to do this, can someone please help?

Answers:

Answer 1:

The general approach I would take is to melt the table into long format, convert the week column names from character to numeric, and then take the difference between the last and first non-zero week for each userId. Here is an example using data.table.

library(data.table)
dt <- fread("userId|36|37|38|39|40
            1|1|0|3|0|0
            2|3|0|0|0|1",
            header = TRUE)

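# Reshape to long format: one row per (userId, week) pair.
dt <- melt(dt, id.vars = "userId")
# melt() stores the former column names as a factor; convert them to numeric week numbers.
dt[, variable := as.numeric(as.character(variable))]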
dt
#     userId variable value
#  1:      1       36     1
#  2:      2       36     3
#  3:      1       37     0
#  4:      2       37     0
#  5:      1       38     3
#  6:      2       38     0
#  7:      1       39     0
#  8:      2       39     0
#  9:      1       40     0
# 10:      2       40     1
# Keep only the non-zero weeks, then take last week minus first week per user.
dt[value != 0, .(lifespan = max(variable) - min(variable)), by = .(userId)]
#    userId lifespan
# 1:      1        2
# 2:      2        4
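
For comparison, the same calculation can be sketched in base R without reshaping. This is only a sketch under two assumptions: every column other than userId is a week column, and the column names are the week numbers. The names wide, week_cols, weeks, and lifespan_of are illustrative and do not come from the answer above.

```r
# Rebuild the example table; check.names = FALSE keeps "36".."40" as column names.
wide <- read.table(text = "userId|36|37|38|39|40
1|1|0|3|0|0
2|3|0|0|0|1", header = TRUE, sep = "|", check.names = FALSE)

# Week numbers taken from the column names
# (assumption: every non-id column is a week column).
week_cols <- setdiff(names(wide), "userId")
weeks <- as.numeric(week_cols)

# Span between the first and last non-zero week for one row of values.
# Returns NA when the row contains no non-zero value at all.
lifespan_of <- function(values) {
  active <- weeks[values != 0]
  if (length(active) == 0) return(NA_real_)
  max(active) - min(active)
}

# Apply the helper to each row of the week columns.
result <- data.frame(
  userId = wide$userId,
  lifespan = apply(as.matrix(wide[week_cols]), 1, lifespan_of)
)
result
```

For the example table this gives lifespans of 2 and 4, matching the output above. One difference in behaviour: a user whose weeks are all zero is kept here with an NA lifespan, whereas filtering on non-zero values before grouping drops that user from the result.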

Comments:

This is exactly what I was after, thank you!
– Benirving92
21 mins ago

Answer 2:

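Here’s a dplyr method (gather() comes from tidyr), using the df defined under Data below:

library(dplyr)
library(tidyr)

df %>%
  # Reshape to long format: one row per (userId, week) pair.
  gather(var, value, -userId) %>%
  # read.table() prefixes numeric column names with "X"; strip it to get the week number.
  mutate(var = as.numeric(sub("X", "", var))) %>%
  group_by(userId) %>%
  # Keep the rows of the first and the last non-zero value for each user.
  slice(c(which.max(value!=0), max(which(value!=0)))) %>%
  summarize(lifespan = var[2]-var[1])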

Result:

# A tibble: 2 x 2
  userId lifespan
   <int>    <dbl>
1      1        2
2      2        4

Data:

df = read.table(text = "userId|36|37|38|39|40
1|1|0|3|0|0
2|3|0|0|0|1", header = TRUE, sep = "|")
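
Since gather() is superseded in current tidyr, the same idea can also be sketched with pivot_longer(). This is a variant under the same assumptions as the answer, not the answerer's own code: it assumes tidyr 1.1.0 or later for names_transform, and the names week and lifespans are illustrative.

```r
library(dplyr)
library(tidyr)

# Same example data; read.table() renames the week columns to X36..X40.
df <- read.table(text = "userId|36|37|38|39|40
1|1|0|3|0|0
2|3|0|0|0|1", header = TRUE, sep = "|")

lifespans <- df %>%
  # Long format, stripping the "X" prefix and converting week names to numbers.
  pivot_longer(-userId,
               names_to = "week",
               names_prefix = "X",
               names_transform = list(week = as.numeric),
               values_to = "value") %>%
  # Keep only the weeks with a non-zero value.
  filter(value != 0) %>%
  group_by(userId) %>%
  # Last non-zero week minus first non-zero week for each user.
  summarize(lifespan = max(week) - min(week))

lifespans
```

Filtering first and then using max() and min() avoids relying on row order within each group, at the cost of dropping any user who has no non-zero week.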

Source:

https://stackoverflow.com/questions/47756325/find-index-of-first-and-last-occurrence-in-data-table
