I assume you're from the SNPedia site? Great example of the benefits of crowdsourcing! Very much appreciated.
If you're the developer of Promethease, could you explain how the personal data is compared with the info on the site? I assume the general SNP information is downloaded by Promethease from SNPedia and no personal data is transmitted? Would it be possible to give more information about security?
I'm the programmer who wrote Promethease, as well as the guy who set up and maintains the SNPedia webserver. Both exist to help me make sense of my own DNA.
Promethease reads the file formats of the various companies, and checks to see which SNPs are known in SNPedia. It then builds a single .html file report based on what it learns. If your 23andMe file says that you are
rs4680 AA
then Promethease needs to read the page
http://www.snpedia.c...php/Rs4680(A;A)
from SNPedia, just like you would if you were doing this by hand. As a result, the webserver logs have a record that you looked at rs4680(A;A). This implies that you are probably rs4680(A;A) and not rs4680(A;G) or rs4680(G;G). So while no personal information is transmitted per se, the specific requests you make do 'leak' your genotypes.
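To make the leak concrete, here is a minimal Python sketch (not Promethease's actual code). It assumes the page title follows the Rs4680(A;A) pattern shown in the URL above; the function names are made up for illustration. The first function builds the page title from a genotype, and the second shows how anyone reading the request log could run it in reverse.

```python
import re

# Hypothetical helpers for illustration only; not part of Promethease.

def genotype_page_title(rsid: str, genotype: str) -> str:
    """Build a title like 'Rs4680(A;A)' from 'rs4680' and 'AA'."""
    name = rsid.strip().lower()
    alleles = ";".join(genotype.strip().upper())
    return f"{name[0].upper()}{name[1:]}({alleles})"

# Assumed title pattern, based on the Rs4680(A;A) example above.
TITLE_RE = re.compile(r"^(rs\d+)\(([ACGT]);([ACGT])\)$", re.IGNORECASE)

def genotypes_implied_by_requests(requested_titles):
    """Recover the genotypes suggested by a list of requested page titles."""
    implied = {}
    for title in requested_titles:
        match = TITLE_RE.match(title)
        if match:
            rsid, first, second = match.groups()
            implied[rsid.lower()] = first.upper() + second.upper()
    return implied

title = genotype_page_title("rs4680", "AA")
print(title)
print(genotypes_implied_by_requests([title]))
```

No name or file is ever sent, but the set of titles requested from one IP address is enough to reconstruct the genotypes behind them.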
As of version 0.1.50 there is the ability to pay $2 and speed things up. This works by downloading a single cache file containing a compressed version of much of the information in SNPedia. Since it's a single, compressed request it runs faster, and as a side benefit there is no need to request the pages for your genotypes. This greatly improves your privacy, and leaves SNPedia with no way to know what genotypes you have. However, since the cache is not yet perfectly up to date, nor fully comprehensive, even during paid Promethease runs most users will need to request a *few* genotypes directly from SNPedia, which leaks some information.
Even with a perfect cache which doesn't need to read any pages, most users will eventually click through with their web browser to look at the full details of some of their genotypes. Again, this leaks information into the server logs.
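The cache-then-fallback behaviour can be sketched roughly as follows. This is only an illustration: the real cache format is not described here, so the dictionary keyed by (rsid, genotype), the function name, and the made-up identifier rs0000001 are all assumptions.

```python
# Hypothetical sketch of "cache first, SNPedia only on a miss".
# The cache layout below is an assumption, not the real file format.

def split_lookups(genotypes, cache):
    """Separate genotypes answered by the local cache from those that
    would still have to be requested from the server (and so leak)."""
    served_locally = {}
    must_request = []
    for rsid, genotype in genotypes.items():
        key = (rsid, genotype)
        if key in cache:
            served_locally[key] = cache[key]
        else:
            must_request.append(key)
    return served_locally, must_request

# Placeholder data: "rs0000001" is a made-up identifier for illustration.
cache = {("rs4680", "AA"): "cached summary text"}
my_genotypes = {"rs4680": "AA", "rs0000001": "CT"}

local, remote = split_lookups(my_genotypes, cache)
print("answered from cache:", sorted(local))
print("still visible to the server:", remote)
```

Only the entries in the second list would show up in the server logs, which is why a more comprehensive cache means a smaller leak.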
In time I hope to ensure that the cache is fully comprehensive (mainly for performance and freshness reasons). I'd also like to scrub the IP/genotypes from my logs. But I'm a strong believer in "release early, release often", and Promethease wouldn't exist at all if I had to wait until every case was covered.
This has always been documented at
http://www.snpedia.c...ethease/privacy
which is linked to from
http://www.snpedia.c...php/Promethease
where you downloaded Promethease. The recent performance enhancements justify making a few additions to that page, which I'll do shortly. If you have further questions and want to ensure they get addressed, it's better to leave them there.
I don't blame you for being security conscious. I am as well, which is why that document has existed since May of 2008
http://www.snpedia.c...;action=history
But I've been amazed at how many people are comfortable sharing their information
http://www.snpedia.c...dex.php/Genomes
and am grateful for the valuable test data they have provided.
If you are fond of tinfoil toques, Promethease may not yet be for you. If you can't wait, I'd suggest doctoring a few 23andMe files and running Promethease a few times from various coffee shops in foreign countries. You alone will know which file was real, and the rest will serve as chaff
http://en.wikipedia....g_and_winnowing
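For anyone who wants to try the chaff idea, a naive Python sketch of "doctoring" a set of genotypes might look like this. It assumes the genotypes have already been read into a dictionary of rsid to two-letter genotype; the function name and the random replacement strategy are my own illustration, and the decoys are not biologically plausible.

```python
import random

# Naive illustration of making decoy genotype sets ("chaff").
# Replacement alleles are chosen at random, so a decoy may contain
# combinations that never occur in real data.

def make_decoy(genotypes, fraction, rng):
    """Return a copy with roughly `fraction` of the genotypes replaced."""
    decoy = dict(genotypes)
    fraction = min(max(fraction, 0.0), 1.0)
    count = round(len(decoy) * fraction)
    for rsid in rng.sample(sorted(decoy), count):
        decoy[rsid] = rng.choice("ACGT") + rng.choice("ACGT")
    return decoy

# Made-up identifiers apart from rs4680, for illustration only.
real = {"rs4680": "AA", "rs0000001": "CT", "rs0000002": "GG", "rs0000003": "AG"}
rng = random.Random(2008)  # fixed seed so the example is repeatable
decoys = [make_decoy(real, 0.5, rng) for _ in range(3)]
for decoy in decoys:
    print(decoy)
```

Each decoy would still need to be written back out in the original file format before it could be run through Promethease.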