Get the rows with an X and a Y for a gene [on hold]

Question:

I'm quite new to programming and I have a question.
I have a file with multiple rows and I want to extract the rows that have an X and a Y for the same name. My problem is that some names have multiple X's and Y's, so a name should qualify when it has at least one X together with multiple Y's, or the other way around!

My data looks like this:

-   -   -   -   -   An.Pos  -   -   -   Name    -   - - - - 

1   678731  680107  2   8   X   1376    1   677193  685396  RP11-206L10.3   12  NA  NA  .    
1   1572876 1636342 2   4   X   63466   1   1590786 1594063 RP11-345P4.7    9   NA  NA  .    
1   1572876 1636342 2   4   Y   63466   1   1603429 1604850 RP11-345P4.7    9   NA  NA  .    
1   1572876 1636342 2   4   X   63466   1   1631369 1633249 MMP23A  9   NA  NA  .    

What I want to get is:

1   1572876 1636342 2   4   X   63466   1   1590786 1594063 RP11-345P4.7    9   NA  NA  .    
1   1572876 1636342 2   4   Y   63466   1   1603429 1604850 RP11-345P4.7    9   NA  NA  .

But in my real data, RP11-345P4.7 can have more than two rows.
So what I need are the names that have at least one X and at least one Y.

PS. I also don't know whether this is easier to do in R, Bash, or another language.
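
One possible approach, sketched here in Python rather than R or Bash: collect the set of labels seen for each name, then keep only the rows whose name was seen with both X and Y. This is a minimal sketch under a few assumptions that are not stated in the question: the file is whitespace-delimited, the X/Y label is the 6th field and the name is the 11th field (as in the sample rows), and the whole file fits in memory. The function name `rows_with_x_and_y` and the constants `LABEL_COL` and `NAME_COL` are made up for illustration.

```python
from collections import defaultdict

LABEL_COL = 5   # 0-based index of the X/Y field in the sample rows (assumption)
NAME_COL = 10   # 0-based index of the name field in the sample rows (assumption)


def rows_with_x_and_y(lines):
    """Return the rows whose name occurs with at least one X and at least one Y."""
    labels_by_name = defaultdict(set)
    parsed = []
    for line in lines:
        fields = line.split()
        if len(fields) <= NAME_COL:
            continue  # skip blank lines and rows that are too short
        label, name = fields[LABEL_COL], fields[NAME_COL]
        if label not in ("X", "Y"):
            continue  # skips the header row and any other label
        labels_by_name[name].add(label)
        parsed.append((name, line.rstrip("\n")))
    # Second pass: keep rows in their original order if the name has both labels.
    return [row for name, row in parsed if labels_by_name[name] == {"X", "Y"}]


# The four sample rows from the question, with runs of spaces collapsed.
sample = """\
1 678731 680107 2 8 X 1376 1 677193 685396 RP11-206L10.3 12 NA NA .
1 1572876 1636342 2 4 X 63466 1 1590786 1594063 RP11-345P4.7 9 NA NA .
1 1572876 1636342 2 4 Y 63466 1 1603429 1604850 RP11-345P4.7 9 NA NA .
1 1572876 1636342 2 4 X 63466 1 1631369 1633249 MMP23A 9 NA NA .
"""

result = rows_with_x_and_y(sample.splitlines())
for row in result:
    print(row)
```

On the sample this keeps the two RP11-345P4.7 rows and drops the others. Because the check is on the set of labels, a name with several X rows and one Y row (or the reverse) keeps all of its rows. To run it on a real file, an open file object can be passed in place of `sample.splitlines()`.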

Original URL:

https://stackoverflow.com/questions/47752801/get-the-rows-with-a-xy-for-a-gene
