admin 管理员组文章数量: 1086019
I have data that I download on a monthly basis to capture point in time amounts. Each time we download this data, we get data tracing back to the last 12 months. If we downloaded this data in February 2023, then col_00 will be any data for Feb 2023 and Col_01 for Jan 2023, and Col_02 for Dec 2022. This keeps going in order until Col_13 which would be Feb 2022. I am not sure how to go about it, but I am trying to figure out a way to dynamically label these columns based on the date of download from a variable I give it, if possible. The reason being that I am not always looking at the most recent download but may need to pull the extract from Aug 2021 tomorrow and then the col names need to change based on that date of download.
raw_df <- data.frame(ID = '1',
col_00 = '2',
col_01 = '3',
col_02 = '3.5',
col_03 = '4',
col_04 = '5',
col_05 = '6',
col_06 = '7',
col_07 = '8',
col_08 = '9',
col_09 = '10',
col_10 = '11',
col_11 = '12',
col_12 = '13',
col_13= '14'
)
Right now I am pulling the last 3 year's worth of data from march and tried making an abbreviation from a number to then pull that month but this runs into trouble when going into the previous year.
mnt_var <- 3
> month.abb[3]
[1] "Mar"
> col_00 <- month.abb[3]
[1] "Mar"
#by subtracting I was able to get Feb and Jan but as soon as I tried to get further
#than this because it then being 0 it did not read it as a month
> col_04 <- month.abb[3-3]
character(0)
This method would only work if data was from let's say December. I have provided the desired output if we keep with the example of march download. I have also tried manually using the rename function to do it manually for each download but since I am processing the last 3 year's worth of data I am applying this to 3 df from different years located in list for faster outputs.
edited_df <- data.frame(ID = '1',
Mar25 = '2',
Feb25 = '3',
Jan25 = '3.5',
Dec24 = '4',
Nov24 = '5',
Oct24 = '6',
Sep24 = '7',
Aug24 = '8',
Jul24 = '9',
Jun24 = '10',
May24 = '11',
Apr24 = '12',
Mar24 = '13'
Feb24 = '14'
)
I have data that I download on a monthly basis to capture point in time amounts. Each time we download this data, we get data tracing back to the last 12 months. If we downloaded this data in February 2023, then col_00 will be any data for Feb 2023 and Col_01 for Jan 2023, and Col_02 for Dec 2022. This keeps going in order until Col_13 which would be Feb 2022. I am not sure how to go about it, but I am trying to figure out a way to dynamically label these columns based on the date of download from a variable I give it, if possible. The reason being that I am not always looking at the most recent download but may need to pull the extract from Aug 2021 tomorrow and then the col names need to change based on that date of download.
raw_df <- data.frame(ID = '1',
col_00 = '2',
col_01 = '3',
col_02 = '3.5',
col_03 = '4',
col_04 = '5',
col_05 = '6',
col_06 = '7',
col_07 = '8',
col_08 = '9',
col_09 = '10',
col_10 = '11',
col_11 = '12',
col_12 = '13',
col_13= '14'
)
Right now I am pulling the last 3 year's worth of data from march and tried making an abbreviation from a number to then pull that month but this runs into trouble when going into the previous year.
mnt_var <- 3
> month.abb[3]
[1] "Mar"
> col_00 <- month.abb[3]
[1] "Mar"
#by subtracting I was able to get Feb and Jan but as soon as I tried to get further
#than this because it then being 0 it did not read it as a month
> col_04 <- month.abb[3-3]
character(0)
This method would only work if data was from let's say December. I have provided the desired output if we keep with the example of march download. I have also tried manually using the rename function to do it manually for each download but since I am processing the last 3 year's worth of data I am applying this to 3 df from different years located in list for faster outputs.
edited_df <- data.frame(ID = '1',
Mar25 = '2',
Feb25 = '3',
Jan25 = '3.5',
Dec24 = '4',
Nov24 = '5',
Oct24 = '6',
Sep24 = '7',
Aug24 = '8',
Jul24 = '9',
Jun24 = '10',
May24 = '11',
Apr24 = '12',
Mar24 = '13'
Feb24 = '14'
)
Share
Improve this question
edited Mar 27 at 20:07
JLG_10
asked Mar 27 at 18:19
JLG_10JLG_10
234 bronze badges
2
|
1 Answer
Reset to default 3I'd suggest using real dates so that you can do math on them (e.g. make a sequence) and then print out in whatever format you want, like "%b%y" for "Mar25".
library(lubridate)
mo_cols_n <- ncol(raw_df) - 1 # assumes months in all columns after first one
cur_mo = floor_date(Sys.Date(), "month") # assumes first header is this month; otherwise
# use e.g. cur_mo = as.Date("2024-02-01")
mo_cols <- format(seq.Date(cur_mo, cur_mo %m-% months(mo_cols_n-1), "-1 month") , "%b%y")
names(raw_df)[(1:mo_cols_n) + 1] = mo_cols # rename the columns using mo_cols
> raw_df
ID Mar25 Feb25 Jan25 Dec24 Nov24 Oct24 Sep24 Aug24 Jul24 Jun24 May24 Apr24 Mar24 Feb24
1 1 2 3 3.5 4 5 6 7 8 9 10 11 12 13 14
本文标签: Renaming columns based on when the data was downloaded in RStack Overflow
版权声明:本文标题:Renaming columns based on when the data was downloaded in R - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://roclinux.cn/p/1744074238a2529049.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
edited_df
is missing the original data from col_03. I expect should be one column wider and end in Feb.) – Jon Spring Commented Mar 27 at 18:36