Wednesday, 25 August 2021

Create loop and save output in .txt or .fasta file

I have a data set with 7754 observations and 5 variables:

name    protein     mutation_CDS    mutation_nonCDS     seq
B*07    02                01               01           ATGCTGGTCATGGCGCCCCGAACCGTCCTCCTGCTGCTCTCGG
B*07    02                01               48           ATGCTGGTCATGGCGCCCCGAACCGTCCTCCTGCTGCTCTCGG
...
B*18    153               NA               NA           ATGCGGGTCACGGCGCCCCGAACCCTCCTCCTGCTGCTCTGGG
B*18    155               NA               NA           ATGCGGGTCACGGCGCCCCGAACCCTCCTCCTGCTGCTCTGGG
...

The "name" variable contains 36 different names, each occurring in a different number of rows: "B*07" "B*08" "B*13" "B*14" "B*15" "B*18" "B*27" "B*35" "B*37" "B*38" "B*39" "B*40" "B*41" "B*42" "B*44" "B*45" "B*46" "B*47" "B*48" "B*49" "B*50" "B*51" "B*52" "B*53" "B*54" "B*55" "B*56" "B*57" "B*58" "B*59" "B*67" "B*73" "B*78" "B*81" "B*82" "B*83"

My idea was to create a loop in R that takes the whole data set (7754 obs.), looks up each unique name (e.g. "B*07"), and either stores all rows (with all columns) having that name in a new data set labelled "B07" or writes them directly to a .txt (or .fasta) file.

I would end up with separate files, for example a file called B07 containing only the B*07 rows (620 obs. and 5 variables):

name    protein     mutation_CDS    mutation_nonCDS     seq
B*07    02                01              01            ATGCTGGTCATGGCGCCCCGAACCGTCCTCCTGCTGCTCTCGG
B*07    02                01              48            ATGCTGGTCATGGCGCCCCGAACCGTCCTCCTGCTGCTCTCGG

and a file called B18 containing only the B*18 rows (X obs. and 5 variables):

name    protein     mutation_CDS    mutation_nonCDS     seq
B*18    153               NA              NA            ATGCGGGTCACGGCGCCCCGAACCCTCCTCCTGCTGCTCTGGG
B*18    155               NA              NA            ATGCGGGTCACGGCGCCCCGAACCCTCCTCCTGCTGCTCTGGG

Put simply, I want to split the 7754 observations into groups according to the value of "name":

"B*07" 620 of 7754

"B*08" X of 7754

"B*13" X of 7754

etc.

Does anyone have an idea how to do that?
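
For illustration, here is a minimal sketch of one way this could be done in base R, using split() to group the rows by "name" and a loop to write one file per group. The data frame `df` below is a small made-up stand-in for the real 7754-row data set, `out_dir` is a hypothetical output folder, and the FASTA header layout (name and protein joined by ":") is only an assumption that would need adjusting to the desired format.

```r
# Toy stand-in for the real data set (illustrative values only).
df <- data.frame(
  name            = c("B*07", "B*07", "B*18", "B*18"),
  protein         = c("02", "02", "153", "155"),
  mutation_CDS    = c("01", "01", NA, NA),
  mutation_nonCDS = c("01", "48", NA, NA),
  seq             = c("ATGCTGGTC", "ATGCTGGTC", "ATGCGGGTC", "ATGCGGGTC"),
  stringsAsFactors = FALSE
)

# split() returns a named list holding one data frame per unique name.
groups <- split(df, df$name)

out_dir <- tempdir()  # hypothetical output folder; replace with a real path

for (nm in names(groups)) {
  g <- groups[[nm]]

  # "*" is awkward or invalid in file names, so "B*07" becomes "B07".
  label <- gsub("*", "", nm, fixed = TRUE)

  # Tab-separated text file containing all 5 columns for this name.
  write.table(g, file = file.path(out_dir, paste0(label, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)

  # FASTA variant: one header line followed by one sequence line per row.
  # The header layout is an assumption, not a fixed convention.
  header <- paste0(">", g$name, ":", g$protein)
  writeLines(c(rbind(header, g$seq)),
             file.path(out_dir, paste0(label, ".fasta")))
}
```

After the loop, `groups[["B*07"]]` still holds the B*07 subset in memory, so the list can be used directly if separate files turn out not to be needed.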
