This is just so amazing!
I have ideas that I float around all the time and generally they never go anywhere.
Well, actually, they have never gone anywhere!
Until now.
I put all the pieces of the puzzle on the table to make it easy for someone to put them together.
All it took was the final step and this idea could take off. It is the old story: you need to have
as few hurdles to success as possible or something will trip people up. With this, the hurdle was not only doing the imputation itself.
Impute2 will do an imputation almost right away. The important thing is to ensure that proper quality control is
done and the imputed results are correct.
Yes, I am feeling quite nice about how all of this has turned out.
Imputation is an important and widely used technique in the scientific community and now there is an online service provider that can offer it to the public.
I like to think that every once in a while I can make a contribution to the advancement of human civilization.
This project has now made such a contribution to humanity.
It would not be unexpected if I now started to receive substantial recognition on the forum for the part that I have played in bringing this glorious achievement to fruition. As I mentioned, playing a part in a team is the way things are usually achieved in life. There were a few different skills required to move imputation over the line. No one person would likely have had everything required for success. I suppose some positive ratings would now be in order. Perhaps a new rating could be introduced that highlighted an idea or project suggestion that was not only proposed but also implemented! The rating might be described as a world changer!
I have now received my imputed results from the link above and I am quite happy.
There are now millions and millions of imputations!
Many of these results can now be used to check my exome variant file!
As a suggestion for anyone thinking of doing an exome scan, it would be wise to insist that
the scan be coupled with a cross-referenced imputation. Many of the most interesting results from the exome scan were
of low quality. Quite a few of these low-quality variants had a high number of reads, though for various reasons (strand bias etc.)
they were still considered suspect. With a simple and inexpensive imputation, one can now cross-reference the exome scan against
the imputation. It is not obvious to me why this would not be considered the standard of practice.
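For anyone who wants to try the cross-referencing idea, here is a rough sketch of what I mean. It assumes both files have already been loaded into simple rsid-to-genotype dictionaries; the function name and the rsids are made up for illustration, and it does not try to handle strand flips, so treat it as a starting point only.

```python
def cross_reference(exome_calls, imputed_calls):
    """Split SNPs found in both sets into agreeing and disagreeing rsids."""
    agree, disagree = [], []
    for rsid, exome_gt in exome_calls.items():
        imputed_gt = imputed_calls.get(rsid)
        # Skip SNPs that were not imputed or were reported as a no-call.
        if imputed_gt is None or imputed_gt == "NN":
            continue
        # Sort the alleles so that "AG" and "GA" count as the same genotype.
        if sorted(exome_gt) == sorted(imputed_gt):
            agree.append(rsid)
        else:
            disagree.append(rsid)
    return agree, disagree


# Hypothetical example data, not real results.
exome = {"rs1": "AG", "rs2": "CC", "rs3": "TT"}
imputed = {"rs1": "GA", "rs2": "CT", "rs3": "NN"}
agree, disagree = cross_reference(exome, imputed)
print("agree:", agree)
print("disagree:", disagree)
```

A low-quality exome variant that lands in the agreeing list has some independent support; one that lands in the disagreeing list deserves a closer look.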
And of course taking the imputation file and running it on Promethease should now generate a massive number of new health insights.
This is all still being worked out, so there are a few snags remaining.
It is currently not clear whether the file is actually in a form that Promethease could read.
This will be an easy fix. There are just a few formatting issues that might be a problem.
The only one that would need to be fixed at the program level is the inclusion of NNs as genotypes in the imputation files.
These were the non-imputables. It is good that they are included, though this might throw off Promethease.
The simple workaround is to turn those lines into comments in the file.
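Here is a minimal sketch of that workaround. I am assuming the imputation file follows the usual tab-separated raw data layout (rsid, chromosome, position, genotype) with "#" marking comment lines; I have not confirmed that this is the exact layout of the imputed files, and the function name is my own.

```python
def comment_out_no_calls(lines, no_call="NN"):
    """Prefix every data line whose last column is the no-call genotype with '# '."""
    out = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        is_comment = line.startswith("#")
        # Assumption: the genotype is the last tab-separated column.
        if not is_comment and fields[-1] == no_call:
            out.append("# " + line)
        else:
            out.append(line)
    return out


# Hypothetical example lines, not real data.
sample = [
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs1\t22\t100\tAG\n",
    "rs2\t22\t200\tNN\n",
]
cleaned = comment_out_no_calls(sample)
print("".join(cleaned), end="")
```

The non-imputable SNPs stay in the file for reference, but a reader that skips comment lines would no longer see them as genotypes.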
Including imputation of mitochondrial DNA would also be welcome. However, for some reason there is a fairly limited online
database of mt sequences.
Another issue that needs to be mentioned is that a fairly low threshold (90%) has been used to qualify as a call.
Some have criticized the accuracy of 23andMe's results based on the argument that with multiple comparisons across
almost 1 million SNPs, even very high accuracy (99.9%+) will leave many errors. These errors will often be
the SNPs that are reported as having health significance. The multiple comparison problem will greatly
intensify with this imputation. The program does not report the actual number of imputed SNPs (it would be helpful if it did).
However, by using a line counter command on the smallest autosomal chromosome (number 22), I found that there were
almost half a million SNPs in the imputation. There appear to be many millions of imputed SNPs, possibly even more than 10 million, though some of the genotypes would have a probability of being correct as low as 90%.
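To put rough numbers on the multiple comparison worry, here is a back-of-the-envelope sketch. The 10 million figure is only my guess from above, and 90% is the floor of the call threshold rather than the typical probability, so the second number is a worst case and not an estimate.

```python
def expected_miscalls(n_snps, accuracy):
    """Expected number of wrong genotypes if every call has the same accuracy."""
    return n_snps * (1.0 - accuracy)


# Gene chip: almost 1 million SNPs at 99.9% accuracy.
chip = expected_miscalls(1_000_000, 0.999)

# Imputation worst case: a guessed 10 million SNPs, all at the 90% floor.
imputed_worst = expected_miscalls(10_000_000, 0.90)

print(f"chip: about {chip:,.0f} expected miscalls")
print(f"imputation (worst case): about {imputed_worst:,.0f} expected miscalls")
```

So even a very accurate chip leaves on the order of a thousand wrong calls, and the imputation could have far more if many genotypes sit near the threshold.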
I am so happy this has finally happened.
It will be very helpful for the many people with gene chip files who want an estimate of the genotype of other SNPs not included on the chip.
Importantly, many of the SNPs reported in the Imputation file had an imputed genotype probability reported as 100%.
It should also be kept in mind that when the SNPs were chosen for the gene chips there was no way of knowing which SNPs would turn out to be important for health traits. Of course there is also no guarantee that a SNP on the 23andMe chip will be recognized by Promethease even when there is a 100% proxy for it. This imputation service could be a great help to many people.
Humanity you are most welcome!
Edited by mag1, 03 October 2015 - 05:50 PM.