Smith DNA Project (Surname) and Autosomal Data
Posted by smithsworldwide in Smith DNA Forum on June 4, 2014
We are sometimes asked whether one can upload autosomal data to our Smith website for matching purposes. At the DNA matching level, we are a YDNA surname project, meaning that participants who wish to compare and match on the site need to have tested a male Smith whose father was a Smith, or have some level of evidence that the participant is actually a Smith (adopted, for example). We do list the FTDNA mtDNA report, while suggesting that those who have tested mtDNA also join one of the mtDNA projects. For those who have not tested or cannot test YDNA, our primary focus is to list the line of descent for comparison in the extensive Smith tree (TNG), in hopes of either finding a paper match or encouraging an eligible Smith to do a YDNA test. We do include a form for sending your qualified autosomal matches.
Autosomal matching goes beyond a surname project in that the person looking for matches is looking at every ancestor some number of generations back, checking to see which sequences match at certain lengths. Because the comparison is made at the DNA sequence level, typically on autosomal SNPs rather than on the STRs (Short Tandem Repeats) that YDNA uses along with surnames, the next step is for the tester to contact matches to see IF the person has any surnames or tree data in common; that link could come, for example, from a collateral line or even be of a different surname. (One difficulty in doing this is that you rely on others having done paper trail research that is correct and extensive enough to find a link.) The testers are comparing against every single person who has done autosomal testing, while we limit participation to Smith and variants and, in fact, see at the administration level only the markers of those who have chosen to participate.
Here is an example of autosomal SNP results in a spreadsheet with 65535 rows of data; only a few rows are shown.
Forget a computer for a moment. Imagine that you have a room full of 65535 pieces of paper. A second person comes in with 65535 more pieces of paper, a large percentage of which are unique and different from your pieces of paper. Your job is to find the papers you have that match the papers person #2 has. Once you do find a match, you want to know IF the second person happens to be looking at the same basic surnames that you are. Suppose that person #2 simply has not been able to go back as far as you, or has different surnames. Even though you might have a match on those papers, you may not be able to get useful information out of the fact that you do. On the other hand, let's say that the second person is a relative of yours (an aunt, uncle, cousin, etc.) or someone you have tested who is a known or theorized relative up your paper trail. You would find some paper segments in common and then ask for the paper trail for person #2. You are then relying on the genealogy investigative abilities of the second person, but you definitely already know that you DO match. In other words, one of the best uses of autosomal testing is to test known relatives in your own family.
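To make the "matching papers" idea a little more concrete, here is a minimal Python sketch of comparing two people's SNP results and looking for runs of consecutive matching SNPs. Everything in it is an assumption made for illustration: the data layout (a dictionary keyed by chromosome and position), the rule that two genotypes "match" when they share at least one letter, and the `min_snps` threshold are all hypothetical. The vendors' matching tools use their own, more involved criteria.

```python
from typing import Dict, List, Tuple

Snp = Tuple[str, int]            # (chromosome, position): hypothetical layout
Results = Dict[Snp, str]         # genotype as two letters, e.g. "AG"
Run = Tuple[str, int, int, int]  # (chromosome, start, end, number of SNPs)


def shares_allele(a: str, b: str) -> bool:
    """Assumed matching rule: the two genotypes have a letter in common."""
    return bool(set(a) & set(b))


def matching_runs(person1: Results, person2: Results, min_snps: int = 3) -> List[Run]:
    """Find runs of consecutive matching SNPs that both people were tested on."""
    common = sorted(set(person1) & set(person2))
    runs: List[Run] = []
    current: List[Snp] = []  # SNPs in the run being built

    def close_run() -> None:
        # Keep the run only if it is long enough to be interesting.
        if len(current) >= min_snps:
            runs.append((current[0][0], current[0][1], current[-1][1], len(current)))
        current.clear()

    for snp in common:
        match = shares_allele(person1[snp], person2[snp])
        new_chromosome = bool(current) and current[-1][0] != snp[0]
        if new_chromosome or not match:
            close_run()
        if match:
            current.append(snp)
    close_run()
    return runs


# Made-up data for two testers (not real results).
you = {("1", 100): "AA", ("1", 200): "AG", ("1", 300): "CC",
       ("1", 400): "TT", ("1", 500): "GG", ("2", 100): "AA"}
person2 = {("1", 100): "AG", ("1", 200): "GG", ("1", 300): "CT",
           ("1", 400): "CC", ("1", 500): "GG", ("2", 100): "AT"}

print(matching_runs(you, person2))
```

Note that a run found this way says only that the two people match on a stretch of DNA. It says nothing about which ancestor or surname the match comes from, which is why the paper trail still matters.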
Now I want you to imagine multiplying this matching scenario by thousands of people. We'll throw the computer in now. You still depend on the other person(s) knowing whether they have a surname in common or a verifiable paper trail, because the match might be on a collateral line or far enough back that neither of you can easily verify anything. In other words, THIS IS NOT TRIVIAL.
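As a rough, back-of-the-envelope illustration of how the work grows, the sketch below counts how many person-to-person comparisons are needed if everyone is compared with everyone else. The participant counts are hypothetical, and the 65535 figure is simply the row count from the spreadsheet example above, used here as an assumed number of SNPs per comparison.

```python
def pair_count(people: int) -> int:
    """Number of distinct pairs when everyone is compared with everyone else."""
    return people * (people - 1) // 2


SNPS_PER_PERSON = 65535  # assumption: row count from the spreadsheet example

for people in (2, 100, 1000, 10000):  # hypothetical participant counts
    pairs = pair_count(people)
    print(f"{people} people: {pairs} pairs, "
          f"about {pairs * SNPS_PER_PERSON} SNP comparisons")
```

The number of pairs grows with roughly the square of the number of participants, so ten times as many people means about a hundred times as much comparing.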
We recommend that those who have autosomal results compare using FamilyTreeDNA's built-in utilities. Someone who has tested with a different vendor can upload autosomal results to FTDNA. FamilyTreeDNA's utilities compare against their entire database to look for matches regardless of surname.
Let's say that we, a volunteer project, had decided to upload autosomal DNA and compare. We would need to start uploading for ALL surnames, and we would need extensive and powerful server resources: more memory, more disk space, more processors, and possibly linked computers. Why, frankly, do all that when FamilyTreeDNA is already doing it, has the extensive database of all surnames behind the scenes, and allows for upload of autosomal results from other vendors? Again, we're a volunteer project that has chosen to look specifically at Smith, which is itself massive.
We have, at the volunteer administrator level, discussed the best way to accommodate those who have taken an autosomal test. If a person did an autosomal test but is actually eligible for a YDNA test (i.e., a male Smith), we recommend that the person do that. YDNA is straightforward and easier to trace to earlier generations via a paper trail than either autosomal or mtDNA is. If the person isn't going to do a YDNA test, then we are really happy to list the direct lineage on the tree part of the site (TNG). If someone definitely knows of a match, having compared DNA segments and paper trail with another autosomal participant through FTDNA FamilyFinder matching or elsewhere, then we will put up a note on the ancestor's person page on TNG that indicates this. In other words, the autosomal result would be tied to the place on the tree where the match occurs, with a note that a second line or branch matches in, based on proofs the participant has already made of a lineage connection.
There is at least one website besides the DNA vendors' own matching sites, gedmatch, that does autosomal comparisons. Basically, a person registers and uploads the autosomal data, the numbers are added and crunched by the system, and then the person can compare sequences with everyone in their database, which is not surname specific. The project, which is volunteer run, then allows you to use various genealogy and triangulation functions to see how your SNP sequences may compare with others. The resources to run such a system are heavy: one would need servers with a lot of storage (for collecting the information), robust database(s), lots of server memory, and probably multiple processors on fast computer(s). In fact, when I looked yesterday, I saw this.
I believe this is not the first time they have been down, but they are also a free resource run by volunteers who are spending their own time and money on it, and I personally believe that if there is a huge influx of people uploading results, they are investing or will have to invest a lot of money in infrastructure to keep it going. That's their decision, and I am all for free projects that help others make sense of hobbies; heck, Smith is one, for the purpose of comparing YDNA. What I am saying is that it's not a trivial, inexpensive, or quick hobby. We have chosen to limit the DNA matching resources to YDNA for those who are Smith, Schmidt, Smyth, Smithson, Smithall, Smithers, etc.
To summarize, while we do not collect or upload autosomal data and instead recommend that members use the FamilyTreeDNA matching utilities for FamilyFinder, we do want you to send in a direct line tree from the earliest Smith you have a source for down to you. We will not post the most recent generations (after 1900), or will do so only with your permission, marking these people private. You can either send a gedcom via email or use this form.