admin 管理员组

文章数量: 1086019

I have data that I download on a monthly basis to capture point in time amounts. Each time we download this data, we get data tracing back to the last 12 months. If we downloaded this data in February 2023, then col_00 will be any data for Feb 2023 and Col_01 for Jan 2023, and Col_02 for Dec 2022. This keeps going in order until Col_13 which would be Feb 2022. I am not sure how to go about it, but I am trying to figure out a way to dynamically label these columns based on the date of download from a variable I give it, if possible. The reason being that I am not always looking at the most recent download but may need to pull the extract from Aug 2021 tomorrow and then the col names need to change based on that date of download.

raw_df <- data.frame(ID = '1',
                     col_00 = '2',
                     col_01 = '3',
                     col_02 = '3.5',
                     col_03 = '4',
                     col_04 = '5',
                     col_05 = '6',
                     col_06 = '7',
                     col_07 = '8',
                     col_08 = '9',
                     col_09 = '10',
                     col_10 = '11',
                     col_11 = '12',
                     col_12 = '13',
                     col_13= '14'
)

Right now I am pulling the last 3 year's worth of data from march and tried making an abbreviation from a number to then pull that month but this runs into trouble when going into the previous year.

mnt_var <- 3

> month.abb[3]
[1] "Mar"

> col_00 <- month.abb[3]
[1] "Mar"

#by subtracting I was able to get Feb and Jan but as soon as I tried to get further
#than this because it then being 0 it did not read it as a month
> col_04 <- month.abb[3-3]
character(0)

This method would only work if data was from let's say December. I have provided the desired output if we keep with the example of march download. I have also tried manually using the rename function to do it manually for each download but since I am processing the last 3 year's worth of data I am applying this to 3 df from different years located in list for faster outputs.

edited_df <- data.frame(ID = '1',
                     Mar25 = '2',
                     Feb25 = '3',
                     Jan25 = '3.5',
                     Dec24 = '4',
                     Nov24 = '5',
                     Oct24 = '6',
                     Sep24 = '7',
                     Aug24 = '8',
                     Jul24 = '9',
                     Jun24 = '10',
                     May24 = '11',
                     Apr24 = '12',
                     Mar24 = '13'
                     Feb24 = '14'
)

I have data that I download on a monthly basis to capture point in time amounts. Each time we download this data, we get data tracing back to the last 12 months. If we downloaded this data in February 2023, then col_00 will be any data for Feb 2023 and Col_01 for Jan 2023, and Col_02 for Dec 2022. This keeps going in order until Col_13 which would be Feb 2022. I am not sure how to go about it, but I am trying to figure out a way to dynamically label these columns based on the date of download from a variable I give it, if possible. The reason being that I am not always looking at the most recent download but may need to pull the extract from Aug 2021 tomorrow and then the col names need to change based on that date of download.

raw_df <- data.frame(ID = '1',
                     col_00 = '2',
                     col_01 = '3',
                     col_02 = '3.5',
                     col_03 = '4',
                     col_04 = '5',
                     col_05 = '6',
                     col_06 = '7',
                     col_07 = '8',
                     col_08 = '9',
                     col_09 = '10',
                     col_10 = '11',
                     col_11 = '12',
                     col_12 = '13',
                     col_13= '14'
)

Right now I am pulling the last 3 year's worth of data from march and tried making an abbreviation from a number to then pull that month but this runs into trouble when going into the previous year.

mnt_var <- 3

> month.abb[3]
[1] "Mar"

> col_00 <- month.abb[3]
[1] "Mar"

#by subtracting I was able to get Feb and Jan but as soon as I tried to get further
#than this because it then being 0 it did not read it as a month
> col_04 <- month.abb[3-3]
character(0)

This method would only work if data was from let's say December. I have provided the desired output if we keep with the example of march download. I have also tried manually using the rename function to do it manually for each download but since I am processing the last 3 year's worth of data I am applying this to 3 df from different years located in list for faster outputs.

edited_df <- data.frame(ID = '1',
                     Mar25 = '2',
                     Feb25 = '3',
                     Jan25 = '3.5',
                     Dec24 = '4',
                     Nov24 = '5',
                     Oct24 = '6',
                     Sep24 = '7',
                     Aug24 = '8',
                     Jul24 = '9',
                     Jun24 = '10',
                     May24 = '11',
                     Apr24 = '12',
                     Mar24 = '13'
                     Feb24 = '14'
)

Share Improve this question edited Mar 27 at 20:07 JLG_10 asked Mar 27 at 18:19 JLG_10JLG_10 234 bronze badges 2
  • (I think your edited_df is missing the original data from col_03. I expect should be one column wider and end in Feb.) – Jon Spring Commented Mar 27 at 18:36
  • Thank you @JonSpring, yes I typed it out wrong thank you for pointing it out. – JLG_10 Commented Mar 27 at 20:07
Add a comment  | 

1 Answer 1

Reset to default 3

I'd suggest using real dates so that you can do math on them (e.g. make a sequence) and then print out in whatever format you want, like "%b%y" for "Mar25".

library(lubridate)
mo_cols_n <- ncol(raw_df) - 1 # assumes months in all columns after first one
cur_mo = floor_date(Sys.Date(), "month") # assumes first header is this month; otherwise
                                         # use e.g. cur_mo = as.Date("2024-02-01")
mo_cols <- format(seq.Date(cur_mo, cur_mo %m-% months(mo_cols_n-1), "-1 month") , "%b%y")
names(raw_df)[(1:mo_cols_n) + 1] = mo_cols # rename the columns using mo_cols

> raw_df
  ID Mar25 Feb25 Jan25 Dec24 Nov24 Oct24 Sep24 Aug24 Jul24 Jun24 May24 Apr24 Mar24 Feb24
1  1     2     3   3.5     4     5     6     7     8     9    10    11    12    13    14

本文标签: Renaming columns based on when the data was downloaded in RStack Overflow