Haplogroup R1a Explained

R1a
Origin-Date:22,000 to 25,000 years ago
Ancestor:Haplogroup R1
Descendants:R-M459, R-YP4141
Members:See List of R1a frequency by population

Haplogroup R1a, or haplogroup R-M420, is a human Y-chromosome DNA haplogroup which is distributed in a large region in Eurasia, extending from Scandinavia and Central Europe to Central Asia, southern Siberia and South Asia.

While one genetic study indicates that R1a originated 25,000 years ago, its subclade M417 (R1a1a1) diversified c. 5,800 years ago. The place of origin of the subclade plays a role in the debate about the origins of Proto-Indo-Europeans.

The SNP mutation R-M420 was discovered after R-M17 (R1a1a), which resulted in a reorganization of the lineage, in particular the establishment of a new paragroup (designated R-M420*) for the relatively rare lineages which are not in the R-SRY10831.2 (R1a1) branch leading to R-M17.

Origins

R1a origins

The genetic divergence of R1a (M420) is estimated to have occurred 25,000 years ago, which is the time of the last glacial maximum. A 2014 study by Peter A. Underhill et al., using 16,244 individuals from over 126 populations from across Eurasia, concluded that there was "a compelling case for the Middle East, possibly near present-day Iran, as the geographic origin of hg R1a". The ancient DNA record has shown the first R1a during the Mesolithic in Eastern Hunter-Gatherers (from Eastern Europe, c. 13,000 years ago),[1] [2] and the earliest case of R* among Upper Paleolithic Ancient North Eurasians,[3] from which the Eastern Hunter-Gatherers predominantly derive their ancestry.[4]

Diversification of R1a1a1 (M417) and ancient migrations

According to one study, the downstream R1a-M417 subclade diversified into Z282 and Z93 circa 5,800 years ago "in the vicinity of Iran and Eastern Turkey". Although R1a occurs as a Y-chromosome haplogroup among speakers of various languages, such as Slavic and Indo-Iranian, the question of the origins of R1a1a is relevant to the ongoing debate concerning the Urheimat of the Proto-Indo-European people, and may also be relevant to the origins of the Indus Valley civilization. R1a shows a strong correlation with the Indo-European languages of Southern and Western Asia, Central and Eastern Europe, and Scandinavia, being most prevalent in Eastern Europe, Central Asia, and South Asia. In Europe, Z282 is particularly prevalent, while in Asia Z93 dominates. The connection between Y-DNA R-M17 and the spread of Indo-European languages was first noted by T. Zerjal and colleagues in 1999.[5]

Indo-European relation

Proposed steppe dispersal of R1a1a

One proposal places the origin in Ukraine, with a postglacial spread of the R1a1 haplogroup during the Late Glacial Maximum, subsequently magnified by the expansion of the Kurgan culture into Europe and eastward. Spencer Wells proposes Central Asian origins, suggesting that the distribution and age of R1a1 point to an ancient migration corresponding to the spread of the Kurgan people in their expansion from the Eurasian steppe. According to another view, R1a1a diversified in the Eurasian Steppes or the Middle East and Caucasus region.

Three genetic studies in 2015 gave support to the Kurgan theory of Gimbutas regarding the Indo-European Urheimat. According to those studies, haplogroups R1b and R1a, now the most common in Europe (R1a is also common in South Asia), would have expanded from the Pontic–Caspian steppes along with the Indo-European languages. The studies also detected an autosomal component present in modern Europeans but not in Neolithic Europeans, which would have been introduced with the paternal lineages R1b and R1a, as well as with Indo-European languages.

One study noted that R1a in South Asia most "likely spread from a single Central Asian source pool, there do seem to be at least three and probably more R1a founder clades within the Indian subcontinent, consistent with multiple waves of arrival." According to Martin P. Richards, a co-author of the study, the prevalence of R1a in India was "very powerful evidence for a substantial Bronze Age migration from central Asia that most likely brought Indo-European speakers to India."[6]

Possible Yamnaya or Corded Ware origins

David Anthony considers the Yamnaya culture to be the Indo-European Urheimat. According to one study, a massive migration from the Yamnaya culture northwards took place c. 2,500 BCE, accounting for 75% of the genetic ancestry of the Corded Ware culture; the authors noted that R1a and R1b may have "spread into Europe from the East after 3,000 BCE". Yet all seven of their Yamnaya samples belonged to the R1b-M269 subclade, and no R1a1a was found among them. This raises the question of where the R1a1a in the Corded Ware culture came from, if not from the Yamnaya culture.

According to Marc Haber, the absence of haplogroup R1a-M458 in Afghanistan does not support a Pontic-Caspian steppe origin for the R1a lineages in modern Central Asian populations.

According to Leo Klejn, the absence of haplogroup R1a in Yamnaya remains (despite its presence in Eneolithic Samara and Eastern Hunter Gatherer populations) makes it unlikely that Europeans inherited haplogroup R1a from Yamnaya.[7]

Archaeologist Barry Cunliffe has said that the absence of haplogroup R1a in Yamnaya specimens is a major weakness in Haak's proposal that R1a has a Yamnaya origin.[8]

Other authors do argue for a Yamnaya origin of R1a1a in the Corded Ware culture, noting that several publications point to the presence of R1a1 in the Comb Ware culture.

Proposed South Asian origins

Kivisild et al. (2003) have proposed either South or West Asia, while others see support for both South and Central Asia. Sengupta et al. (2006) have proposed Indian origins.[9] Thanseem et al. (2006) have proposed either South or Central Asia.[10] Sahoo et al. (2006) have proposed either South or West Asia.[11] Thangaraj et al. (2010) have also proposed a South Asian origin.[12] Sharma et al. (2009) theorize that R1a has existed in India for more than 18,000 years, and possibly for as long as 44,000 years.

A number of studies from 2006 to 2010 concluded that South Asian populations have the highest STR diversity within R1a1a, and correspondingly older TMRCA datings. R1a1a is present among both higher (Brahmin) castes and lower castes, and while the frequency is higher among Brahmin castes, the oldest TMRCA datings of the R1a haplogroup occur in the Saharia tribe, a scheduled caste of the Bundelkhand region of Central India.

From these findings some researchers argued that R1a1a originated in South Asia, excluding a more recent, yet minor, genetic influx from Indo-European migrants in northwestern regions such as Afghanistan, Balochistan, Punjab, and Kashmir.

The conclusion that R1a originated in India has been questioned by more recent research,[13] offering proof that R1a arrived in India with multiple waves of migration.

Proposed Transcaucasia and West Asian origins and possible influence on Indus Valley Civilization

See also: Kura–Araxes culture and Uruk period.

One study found that part of the Yamnaya ancestry derived from the Middle East and that Neolithic techniques probably arrived at the Yamnaya culture from the Balkans. In the area of the Rössen culture (4,600–4,300 BC), which was situated in Germany and predates the Corded Ware culture, an old subclade of R1a, namely L664, can still be found.

Part of the South Asian genetic ancestry derives from west Eurasian populations, and some researchers have implied that Z93 may have come to India via Iran and expanded there during the Indus Valley civilization.

One study proposed that the roots of Z93 lie in West Asia, and that "Z93 and L342.2 expanded in a southeasterly direction from Transcaucasia into South Asia", noting that such an expansion is compatible with "the archeological records of eastward expansion of West Asian populations in the 4th millennium BCE culminating in the so-called Kura-Araxes migrations in the post-Uruk IV period." Yet Lazaridis noted that sample I1635, an Armenian Kura-Araxes sample, carried Y-haplogroup R1b1-M415(xM269) (also called R1b1a1b-CTS3187).[14]

According to one study, the diversification of Z93 and the "early urbanization within the Indus Valley ... occurred at [5,600 years ago] and the geographic distribution of R1a-M780 (Figure 3d) may reflect this." It has also been noted that "striking expansions" occurred within R1a-Z93 at c. 4,500–4,000 years ago, which "predates by a few centuries the collapse of the Indus Valley Civilisation."

However, according to another study, steppe pastoralists are a likely source for R1a in India.

Phylogeny

The R1a family tree now has three major levels of branching, with the largest number of defined subclades within the dominant and best known branch, R1a1a (which will be found with various names such as "R1a1" in relatively recent but not the latest literature).

Topology

The topology of R1a is as follows (codes [in brackets] are non-ISOGG codes).[15] [16] [17] According to Tatiana et al. (2014), a "rapid diversification process of K-M526 likely occurred in Southeast Asia, with subsequent westward expansions of the ancestors of haplogroups R and Q."

R-M173 (R1)

R1a is distinguished by several unique markers, including the M420 mutation. It is a subclade of Haplogroup R-M173 (previously called R1). R1a has the sister-subclades Haplogroup R1b-M343, and the paragroup R-M173*.

R-M420 (R1a)

R1a, defined by the mutation M420, has two primary branches: R-M459 (R1a1) and R-YP4141 (R1a2).

As of 2024, there are no true, known examples of basal R1a*. When examples that were negative for M459 were first discovered, they were initially regarded as a rare, basal paragroup, under R-M420* and defined by the mutation SRY1532.2.[19] Examples of R1a initially considered to be basal and to constitute a paragroup are now known to have been part of a fundamental forking in R1a*, i.e. R1a2 (R-YP4141). (The previously defining SNP SRY1532.2 is now regarded as unreliable.) R1a2 has two sub-branches: R1a2a (R-YP5018) and R1a2b (R-YP4132).

R-YP4141 (R1a2)

R1a2 (R-YP4141) has two branches: R1a2a (R-YP5018) and R1a2b (R-YP4132).[20]

This rare primary subclade was initially regarded as part of a paragroup of R1a*, defined by SRY1532.2 (and understood to always exclude M459 and its synonyms SRY10831.2, M448, L122, and M516).[21]

YP4141 later replaced SRY1532.2 – which was found to be unreliable – and the R1a(xR-M459) group was redefined as R1a2. It is relatively unusual, though it has been tested in more than one survey. One survey reported R-SRY1532.2* for 1/15 Himachal Pradesh Rajput samples. Underhill et al. (2009) reported 1/51 in Norway, 3/305 in Sweden, 1/57 Greek Macedonians, 1/150 (or 2/150) Iranians, 2/734 ethnic Armenians, 1/141 Kabardians, 1/121 Omanis, 1/164 in the United Arab Emirates, and 3/612 in Turkey. Testing of 7224 more males in 73 other Eurasian populations showed no sign of this category.

R-M459 (R1a1)

The major subclade R-M459 includes an overwhelming majority of individuals within R1a more broadly. However, as of 2024, all known individuals with M459 fall within R1a1a or R1a1b; no examples of R1a1* have yet been identified.

R-YP1272 (R1a1b)

R-YP1272, also known as R-M459(xM198), is an extremely rare primary subclade of R1a1. It has been found in three individuals, from Belarus, Tunisia and the Coptic community in Egypt respectively.[22]

R-M17/M198 (R1a1a)

The following SNPs are associated with R1a1a:

SNP    Mutation   Y-position (NCBI36)   Y-position (GRCh37)   RefSNP ID
M17    INS G      20192556              21733168              rs3908
M198   C->T       13540146              15030752              rs2020857
M512   C->T       14824547              16315153              rs17222146
M514   C->T       17884688              19375294              rs17315926
M515   T->A       12564623              14054623              rs17221601
L168   A->G       14711571              16202177              -
L449   C->T       21376144              22966756              -
L457   G->A       14946266              16436872              rs113195541
L566   C->T       -                     -                     -

R-M417 (R1a1a1)

R1a1a1 (R-M417) is the most widely found subclade, in two variations which are found respectively in Europe (R1a1a1b1 (R-Z282); [R1a1a1a*] (R-Z282) in Underhill 2014) and in Central and South Asia (R1a1a1b2 (R-Z93); [R1a1a2*] (R-Z93) in Underhill 2014).
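The branching described in this section can be sketched as a small parent-link table. This is only an illustrative, simplified model: it covers just the clades named in the text, it omits any intermediate branches between R-M417 and R-Z282/R-Z93, and the names `PARENT`, `LABEL`, `lineage` and `is_descendant` are hypothetical helpers, not part of any published tool.

```python
# Parent links for the clades named in this section, keyed by mutational name.
# Simplified: intermediate branches are omitted, and the phylogenetic labels
# follow the text above (other sources may label some clades differently).
PARENT = {
    "R-M420": "R-M173",
    "R-M459": "R-M420",
    "R-YP4141": "R-M420",
    "R-YP5018": "R-YP4141",
    "R-YP4132": "R-YP4141",
    "R-M17": "R-M459",
    "R-YP1272": "R-M459",
    "R-M417": "R-M17",
    "R-Z282": "R-M417",
    "R-Z93": "R-M417",
}

LABEL = {
    "R-M173": "R1", "R-M420": "R1a", "R-M459": "R1a1", "R-YP4141": "R1a2",
    "R-YP5018": "R1a2a", "R-YP4132": "R1a2b", "R-M17": "R1a1a",
    "R-YP1272": "R1a1b", "R-M417": "R1a1a1",
    "R-Z282": "R1a1a1b1", "R-Z93": "R1a1a1b2",
}


def lineage(clade):
    """Return the path from the root (R-M173) down to the given clade."""
    path = [clade]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


def is_descendant(clade, ancestor):
    """True if `ancestor` lies on the path from the root to `clade`."""
    return ancestor in lineage(clade)


if __name__ == "__main__":
    for name in lineage("R-Z93"):
        print(f"{name} ({LABEL[name]})")
```

Keying the table by mutational name keeps it stable if the phylogenetic labels are later revised; only `LABEL` would need updating.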

R-Z282 (R1a1a1b1a) (Eastern Europe)

This large subclade appears to encompass most of the R1a1a found in Europe.

R-M458 (R1a1a1b1a1)

R-M458 is a mainly Slavic SNP, characterized by its own mutation, and was first called cluster N. Underhill et al. (2009) found it to be present in modern European populations roughly between the Rhine catchment and the Ural Mountains and traced it to "a founder effect that ... falls into the early Holocene period, 7.9±2.6 KYA." (This date was calculated with Zhivotovsky rates, which overvalue ages about threefold.) M458 was found in one skeleton from a 14th-century grave field in Usedom, Mecklenburg-Vorpommern, Germany.[23] The paper by Underhill et al. (2009) also reports a surprisingly high frequency of M458 in some Northern Caucasian populations (18% among Ak Nogai,[24] 7.8% among Qara Nogai and 3.4% among Abazas).[25]

R-L260 (R1a1a1b1a1a)

R1a1a1b1a1a (R-L260), commonly referred to as West Slavic or Polish, is a subclade of the larger parent group R-M458, and was first identified as an STR cluster. In 2010 it was verified to be a haplogroup identified by its own mutation (SNP).[26] It apparently accounts for about 8% of Polish men, making it the most common subclade in Poland. Outside of Poland it is less common. In addition to Poland, it is mainly found in the Czech Republic and Slovakia, and is considered "clearly West Slavic". The founding ancestor of R-L260 is estimated to have lived between 2000 and 3000 years ago, i.e. during the Iron Age, with significant population expansion less than 1,500 years ago.

R-M334

R-M334 ([R1a1a1g1],[17] a subclade of [R1a1a1g] (M458),[17] or R1a1a1b1a1 (M458)[16]) was found by Underhill et al. (2009) in only one Estonian man and may define a very recently founded and small clade.

R1a1a1b1a2b3* (Gwozdz's Cluster K)

R1a1a1b1a2b3* (M417+, Z645+, Z283+, Z282+, Z280+, CTS1211+, CTS3402, Y33+, CTS3318+, Y2613+) (Gwozdz's Cluster K)[15] is an STR-based group that is R-M17(xM458). This cluster is common in Poland but not exclusive to it.

R1a1a1b1a2b3a (R-L365)

R1a1a1b1a2b3a (R-L365)[16] was earlier called Cluster G.

R1a1a1b2 (R-Z93) (Asia)

Relative frequency of R-M434 to R-M17
Region        People    N     R-M17 number   R-M17 freq. (%)   R-M434 number   R-M434 freq. (%)
Pakistan      Baloch    60    9              15                5               8
Pakistan      Makrani   60    15             25                4               7
Middle East   Oman      121   11             9                 3               2.5
Pakistan      Sindhi    134   65             49                2               1.5
Table only shows positive sets from N = 3667 derived from 60 Eurasian populations sample.
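The percentages in the table follow directly from the counts and sample sizes. As a minimal sketch (the names `ROWS`, `percent` and `m434_to_m17_ratio` are hypothetical, and the ratio of the two counts is only one possible reading of "relative frequency"), the arithmetic can be reproduced as follows:

```python
# (region, people, N, R-M17 count, R-M434 count), taken from the table above.
ROWS = [
    ("Pakistan", "Baloch", 60, 9, 5),
    ("Pakistan", "Makrani", 60, 15, 4),
    ("Middle East", "Oman", 121, 11, 3),
    ("Pakistan", "Sindhi", 134, 65, 2),
]


def percent(count, n):
    """Frequency of `count` carriers in a sample of size `n`, in percent."""
    return 100.0 * count / n


def m434_to_m17_ratio(m17_count, m434_count):
    """Ratio of R-M434 carriers to R-M17 carriers in the same sample."""
    return m434_count / m17_count


if __name__ == "__main__":
    for region, people, n, m17, m434 in ROWS:
        print(
            f"{people} ({region}): R-M17 {percent(m17, n):.1f}%, "
            f"R-M434 {percent(m434, n):.1f}%, "
            f"ratio {m434_to_m17_ratio(m17, m434):.2f}"
        )
```

Computed to one decimal place, some values differ slightly from the rounded figures in the table (for example 48.5% rather than 49% for R-M17 among the Sindhi).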

This large subclade appears to encompass most of the R1a1a found in Asia, and is associated with Indo-European migrations (including those of the Scythians and the Indo-Aryan migrations).

Geographic distribution of R1a1a

Pre-historical

In Mesolithic Europe, R1a is characteristic of Eastern Hunter-Gatherers (EHGs). A male EHG of the Veretye culture buried at Peschanitsa near Lake Lacha in Arkhangelsk Oblast, Russia c. 10,700 BCE was found to be a carrier of the paternal haplogroup R1a5-YP1301 and the maternal haplogroup U4a. A male named PES001, from Peschanitsa in northwestern Russia, was found to carry R1a5 and dates to at least 10,600 years ago. Further examples are the males Minino II (V) and Minino II (I/1), both from Minino in northwestern Russia: the former carried R1a1 and is 10,600 years old, and the latter carried R1a and is at least 10,400 years old.[29] A Mesolithic male from Karelia c. 8,800 BCE to 7,950 BCE has been found to be carrying haplogroup R1a. A Mesolithic male buried at Deriivka c. 7,000 BCE to 6,700 BCE carried the paternal haplogroup R1a and the maternal U5a2a. Another male from Karelia from c. 5,500 to 5,000 BCE, who was considered an EHG, carried haplogroup R1a. A male from the Comb Ceramic culture in Kudruküla c. 5,900 BCE to 3,800 BCE has been determined to be a carrier of R1a and the maternal U2e1. According to archaeologist David Anthony, the paternal R1a-Z93 was found at the Oskol river near a no longer existing kolkhoz "Alexandria", Ukraine c. 4,000 BCE, "the earliest known sample to show the genetic adaptation to lactase persistence (13910-T)." R1a has been found in the Corded Ware culture, in which it is predominant. Examined males of the Bronze Age Fatyanovo culture belong entirely to R1a, specifically subclade R1a-Z93.

Haplogroup R1a has also been found in later ancient remains associated with the Urnfield culture,[30] as well as in burials of the Sintashta, Andronovo, Pazyryk, Tagar, Tashtyk, and Srubnaya cultures, among the inhabitants of ancient Tanais,[31] in the Tarim mummies, and in the aristocracy of the Xiongnu. The skeletal remains of a father and his two sons, from an archaeological site discovered in 2005 near Eulau (in Saxony-Anhalt, Germany) and dated to about 2600 BCE, tested positive for the Y-SNP marker SRY10831.2. The Ysearch number for the Eulau remains is 2C46S. The ancestral clade was thus present in Europe at least 4600 years ago, in association with one site of the widespread Corded Ware culture.

Europe

In Europe, the R1a1a sub-clade is primarily characteristic of Balto-Slavic populations, with two exceptions: southern Slavs and northern Russians. The highest frequency of R1a1a in Europe is observed in Sorbs (63%), a West Slavic ethnic group, followed by Hungarians (60%). Other groups with significant R1a1a, ranging from 27% to up to 58%, include Czechs, Poles, Slovenians, Slovaks, Moldovans, Belarusians, Rusyns, Ukrainians, and Russians. R1a frequency decreases in northeastern Russian populations down to 20%–30%, in contrast to central-southern Russia, where its frequency is twice as high. In the Baltics, R1a1a frequencies decrease from Lithuania (45%) to Estonia (around 30%).

There is also a significant presence in peoples of Germanic descent, with highest levels in Norway, Sweden and Iceland, where between 20 and 30% of men are in R1a1a. Vikings and Normans may have also carried the R1a1a lineage further out, accounting for at least part of the small presence in the British Isles, the Canary Islands, and Sicily. Haplogroup R1a1a averages between 10 and 30% in Germans, with a peak in Rostock at 31.3%. R1a1a is found at a very low frequency among Dutch people (3.7%) and is virtually absent in Danes.[32]

In Southern Europe R1a1a is not common, but significant levels have been found in pockets, such as in the Pas Valley in Northern Spain, areas of Venice, and Calabria in Italy. The Balkans shows wide variation between areas with significant levels of R1a1a, for example 36–39% in Slovenia,[33] 27–34% in Croatia,[34] [35] [36] [37] and over 30% in Greek Macedonia, but less than 10% in Albania, Kosovo and parts of Greece south of Olympus gorge.

R1a is virtually composed only of the Z284 subclade in Scandinavia. In Slovenia, the main subclade is Z282 (Z280 and M458), although the Z284 subclade was found in one Slovenian sample. There is a negligible representation of Z93 in Turkey, 12.1%.[27] West Slavs and Hungarians are characterized by a high frequency of the subclade M458 and a low frequency of Z92, a subclade of Z280. Hundreds of Slovenian and Czech samples lack the Z92 subclade of Z280, while Poles, Slovaks, Croats and Hungarians show only a very low frequency of Z92. The Balts, East Slavs, Serbs, Macedonians, Bulgarians and Romanians demonstrate a ratio Z280>M458 and a high, up to prevailing, share of Z92. Balts and East Slavs have the same subclades and similar frequencies in a more detailed phylogeny of the subclades.[38] [39] The Russian geneticist Oleg Balanovsky speculated that an assimilated pre-Slavic substrate predominates in the genetics of East and West Slavic populations. According to him, the common genetic structure that distinguishes East Slavs and Balts from other populations may suggest that the pre-Slavic substrate of the East and West Slavs consisted most significantly of Baltic speakers, who at one point predated the Slavs in the cultures of the Eurasian steppe according to archaeological and toponymic references.

Asia

Central Asia

One study found R1a1a in 64% of a sample of the Tajiks of Tajikistan and 63% of a sample of the Kyrgyz of Kyrgyzstan.

Another study found R1a1a-M17 in 26.0% (53/204) of a set of samples from Afghanistan, including 60% (3/5) of a sample of Nuristanis, 51.0% (25/49) of a sample of Pashtuns, 30.4% (17/56) of a sample of Tajiks, 17.6% (3/17) of a sample of Uzbeks, 6.7% (4/60) of a sample of Hazaras, and in the only sampled Turkmen individual.

A further study found R1a1a-M198/M17 in 56.3% (49/87) of a pair of samples of Pashtuns from Afghanistan (including 20/34 or 58.8% of a sample of Pashtuns from Baghlan and 29/53 or 54.7% of a sample of Pashtuns from Kunduz), 29.1% (37/127) of a pool of samples of Uzbeks from Afghanistan (including 28/94 or 29.8% of a sample of Uzbeks from Jawzjan, 8/28 or 28.6% of a sample of Uzbeks from Sar-e Pol, and 1/5 or 20% of a sample of Uzbeks from Balkh), 27.5% (39/142) of a pool of samples of Tajiks from Afghanistan (including 22/54 or 40.7% of a sample of Tajiks from Balkh, 9/35 or 25.7% of a sample of Tajiks from Takhar, 4/16 or 25.0% of a sample of Tajiks from Samangan, and 4/37 or 10.8% of a sample of Tajiks from Badakhshan), 16.2% (12/74) of a sample of Turkmens from Jawzjan, and 9.1% (7/77) of a pair of samples of Hazara from Afghanistan (including 7/69 or 10.1% of a sample of Hazara from Bamiyan and 0/8 or 0% of a sample of Hazara from Balkh).
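The pooled figures quoted above are obtained by summing carriers and sample sizes across the regional samples, not by averaging the regional percentages. A minimal sketch of that arithmetic, using the counts given in the text (the function name `pooled` and the variable names are illustrative assumptions):

```python
def pooled(samples):
    """Pool (carriers, sample_size) pairs.

    Returns (total carriers, total sample size, percent rounded to 0.1).
    """
    carriers = sum(c for c, _ in samples)
    size = sum(n for _, n in samples)
    return carriers, size, round(100.0 * carriers / size, 1)


# Regional (carriers, sample size) pairs as quoted in the paragraph above.
pashtun = [(20, 34), (29, 53)]                 # Baghlan, Kunduz
uzbek = [(28, 94), (8, 28), (1, 5)]            # Jawzjan, Sar-e Pol, Balkh
tajik = [(22, 54), (9, 35), (4, 16), (4, 37)]  # Balkh, Takhar, Samangan, Badakhshan
hazara = [(7, 69), (0, 8)]                     # Bamiyan, Balkh

if __name__ == "__main__":
    for name, group in [("Pashtun", pashtun), ("Uzbek", uzbek),
                        ("Tajik", tajik), ("Hazara", hazara)]:
        carriers, size, pct = pooled(group)
        print(f"{name}: {carriers}/{size} = {pct}%")
```

Pooling weights each regional sample by its size, which is why a small sample such as the five Uzbeks from Balkh has little effect on the overall figure.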

Another study found R1a1-SRY10831.2 in 30.0% (12/40) of a sample of Tajiks from Tajikistan.

A study of Kazakhs found R1a-M198 in 6.03% (78/1294) of a set of samples from Kazakhstan. R1a-M198 was observed with greater than average frequency in the study's samples of the following Kazakh tribes: 13/41 = 31.7% of a sample of Suan, 8/29 = 27.6% of a sample of Oshaqty, 6/30 = 20.0% of a sample of Qozha, 4/29 = 13.8% of a sample of Qypshaq, 1/8 = 12.5% of a sample of Tore, 9/86 = 10.5% of a sample of Jetyru, 4/50 = 8.0% of a sample of Argyn, 1/13 = 7.7% of a sample of Shanyshqyly, 8/122 = 6.6% of a sample of Alimuly, 3/46 = 6.5% of a sample of Alban. R1a-M198 was also observed in 5/42 = 11.9% of a sample of Kazakhs of unreported tribal affiliation.

South Asia

In South Asia, R1a1a has often been observed in a number of demographic groups.

In India, high frequencies of this haplogroup are observed in West Bengal Brahmins (72%) in the east, Bhanushali (67%) and Gujarat Lohanas (60%) in the west, Uttar Pradesh Brahmins (68%), Punjab/Haryana Khatris (67%) and Ahirs (63%) in the north, and Karnataka Medars (39%) in the south. It has also been found in several South Indian Dravidian-speaking Adivasis, including the Chenchu (26%) of Andhra Pradesh, the Kota of Andhra Pradesh (22.58%) and the Kallar of Tamil Nadu, suggesting that R1a1a is widespread among tribal Southern Indians.

Besides these, studies show high percentages in regionally diverse groups such as Manipuris (50%) to the extreme North East and among Punjabis (47%) to the extreme North West.

In Pakistan it is found at 71% among the Mohanna tribe in Sindh province to the south and 46% among the Baltis of Gilgit-Baltistan to the north. Among the Sinhalese of Sri Lanka, 23% were found to be R1a1a (R-SRY1532) positive.[40] Hindus of Chitwan District in the Terai region of Nepal show it at 69%.

East Asia

The frequency of R1a1a is comparatively low among some Turkic-speaking groups like Yakuts, yet levels are higher (19 to 28%) in certain Turkic or Mongolic-speaking groups of Northwestern China, such as the Bonan, Dongxiang, Salar, and Uyghurs.

A Chinese paper published in 2018 found R1a-Z94 in 38.5% (15/39) of a sample of Keriyalik Uyghurs from Darya Boyi / Darya Boye Village, Yutian County, Xinjiang (于田县达里雅布依乡), R1a-Z93 in 28.9% (22/76) of a sample of Dolan Uyghurs from Horiqol township, Awat County, Xinjiang (阿瓦提县乌鲁却勒镇), and R1a-Z93 in 6.3% (4/64) of a sample of Loplik Uyghurs from Karquga / Qarchugha Village, Yuli County, Xinjiang (尉犁县喀尔曲尕乡). R1a(xZ93) was observed only in one of 76 Dolan Uyghurs. Note that Darya Boyi Village is located in a remote oasis formed by the Keriya River in the Taklamakan Desert.

A 2011 Y-DNA study found R1a1 in 10% of a sample of southern Hui people from Yunnan, 1.6% of a sample of Tibetan people from Tibet (Tibet Autonomous Region), 1.6% of a sample of Xibe people from Xinjiang, 3.2% of a sample of northern Hui from Ningxia, 9.4% of a sample of Hazak (Kazakhs) from Xinjiang, rates of 24.0%, 22.2%, 35.2% and 29.2% in 4 different samples of Uyghurs from Xinjiang, and 9.1% in a sample of Mongols from Inner Mongolia. A different subclade of R1 was also found in 1.5% of a sample of northern Hui from Ningxia. In the same study, no cases of R1a were detected in 6 samples of Han Chinese in Yunnan, 1 sample of Han in Guangxi, 5 samples of Han in Guizhou, 2 samples of Han in Guangdong, 2 samples of Han in Fujian, 2 samples of Han in Zhejiang, 1 sample of Han in Shanghai, 1 sample of Han in Jiangxi, 2 samples of Han in Hunan, 1 sample of Han in Hubei, 2 samples of Han in Sichuan, 1 sample of Han in Chongqing, 3 samples of Han in Shandong, 5 samples of Han in Gansu, 3 samples of Han in Jilin and 2 samples of Han in Heilongjiang.[41]

In one sample from northwestern China, 40% of Salars, 45.2% of Tajiks of Xinjiang, 54.3% of Dongxiang, 60.6% of Tatars and 68.9% of Kyrgyz in Xinjiang had R1a1-M17. The Bao'an (Bonan) had the highest haplogroup diversity, 0.8946±0.0305, while the other ethnic minorities in northwestern China had a high haplogroup diversity similar to that of Central Asians, 0.7602±0.0546.[42]

In Eastern Siberia, R1a1a is found among certain indigenous ethnic groups including Kamchatkans and Chukotkans, and peaking in Itel'man at 22%.

Southeast Asia

Y-haplogroups R1a-M420 and R2-M479 are found in Ede (8.3% and 4.2%) and Giarai (3.7% and 3.7%) peoples in Vietnam. The Cham additionally have haplogroups R-M17 (13.6%) and R-M124 (3.4%).

R1a1a1b2a2a (R-Z2123) and R1a1 are found in Khmer peoples from Thailand (3.4%) and Cambodia (7.2%) respectively. Haplogroup R1a1a1b2a1b (R-Y6) is also found among Kuy peoples (5%).

According to Changmai et al. (2022), these haplogroup frequencies originate from South Asians, who left a cultural and genetic legacy in Southeast Asia since the first millennium CE.[43]

West Asia

R1a1a has been found in various forms in most parts of Western Asia, in widely varying concentrations, from almost no presence in areas such as Jordan to much higher levels in parts of Kuwait and Iran. The Shimar (Shammar) Bedouin tribe in Kuwait shows the highest frequency in the Middle East at 43%.

One study noted that Iranians in the western part of the country show low R1a1a levels, while males from eastern parts of Iran carried up to 35% R1a1a. Another found R1a1a in approximately 20% of Iranian males from the cities of Tehran and Isfahan. A further study of Iran noted much higher frequencies in the south than in the north.

A newer study has found 20.3% R-M17* among Kurdish samples which were taken in the Kurdistan Province in western Iran, 19% among Azerbaijanis in West Azerbaijan, 9.7% among Mazandaranis in North Iran in the province of Mazandaran, 9.4% among Gilaks in province of Gilan, 12.8% among Persian and 17.6% among Zoroastrians in Yazd, 18.2% among Persians in Isfahan, 20.3% among Persians in Khorasan, 16.7% Afro-Iranians, 18.4% Qeshmi "Gheshmi", 21.4% among Persian Bandari people in Hormozgan and 25% among the Baloch people in Sistan and Baluchestan Province.

Another study found haplogroup R1a in 9.68% (18/186) of a set of samples from Iran, though with a large variance ranging from 0% (0/18) in a sample of Iranians from Tehran to 25% (5/20) in a sample of Iranians from Khorasan and 27% (3/11) in a sample of Iranians of unknown provenance. All Iranian R1a individuals carried the M198 and M17 mutations except one individual in a sample of Iranians from Gilan (n=27), who was reported to belong to R1a-SRY1532.2(xM198, M17).

A separate study found R1a1-SRY10831.2 in 20.8% (16/77) of a sample of Persians collected in the provinces of Khorasan and Kerman in eastern Iran, but did not find any member of this haplogroup in a sample of 25 Kurds collected in the province of Kermanshah in western Iran.

Further to the north of these Western Asian regions, on the other hand, R1a1a levels start to increase in the Caucasus, once again in an uneven way. Several populations studied have shown no sign of R1a1a, while the highest levels so far discovered in the region appear to belong to speakers of the Karachay-Balkar language, among whom about one quarter of men tested so far are in haplogroup R1a1a.

Historic naming of R1a

The historic naming system commonly used for R1a was inconsistent in different published sources, because it changed often; this requires some explanation.

In 2002, the Y Chromosome Consortium (YCC) proposed a new naming system for haplogroups, which has now become standard. In this system, names with the format "R1" and "R1a" are "phylogenetic" names, aimed at marking positions in a family tree. Names of SNP mutations can also be used to name clades or haplogroups. For example, as M173 is currently the defining mutation of R1, R1 is also R-M173, a "mutational" clade name. When a new branching in a tree is discovered, some phylogenetic names will change, but by definition all mutational names will remain the same.

The widely occurring haplogroup defined by mutation M17 was known by various names, such as "Eu19", in the older naming systems. The 2002 YCC proposal assigned the name R1a to the haplogroup defined by mutation SRY1532.2. This included Eu19 (i.e. R-M17) as a subclade, so Eu19 was named R1a1. Note that SRY1532.2 is also known as SRY10831.2. The discovery of M420 in 2009 has caused a reassignment of these phylogenetic names. R1a is now defined by the M420 mutation: in this updated tree, the subclade defined by SRY1532.2 has moved from R1a to R1a1, and Eu19 (R-M17) from R1a1 to R1a1a.
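The difference between the two kinds of names can be illustrated with a small lookup sketch. It encodes only the assignments described in this section; the dictionary names `YCC_2002` and `AFTER_M420` and the helper `renamed` are hypothetical and are not taken from any YCC or ISOGG resource.

```python
# Mutational name -> phylogenetic name, as described in the text above.
YCC_2002 = {
    "R-M173": "R1",
    "R-SRY1532.2": "R1a",
    "R-M17": "R1a1",  # formerly "Eu19"
}

# After the discovery of M420 (2009), per the text above.
AFTER_M420 = {
    "R-M173": "R1",
    "R-M420": "R1a",
    "R-SRY1532.2": "R1a1",
    "R-M17": "R1a1a",
}


def renamed(old, new):
    """Return {mutational name: (old label, new label)} for labels that changed."""
    return {
        mutation: (old[mutation], new[mutation])
        for mutation in old
        if mutation in new and old[mutation] != new[mutation]
    }


if __name__ == "__main__":
    for mutation, (before, after) in renamed(YCC_2002, AFTER_M420).items():
        print(f"{mutation}: {before} -> {after}")
```

The mutational keys are identical in both dictionaries, while the phylogenetic values shift by one level, which is the behaviour the naming system is designed to have when a new branching is discovered.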

More recent updates recorded at the ISOGG reference webpage involve branches of R-M17, including one major branch, R-M417.

See also

Y-DNA backbone tree

Notes and References

  1. Saag. Lehti. Vasilyev. Sergey V.. Varul. Liivi. Kosorukova. Natalia V.. Gerasimov. Dmitri V.. Oshibkina. Svetlana V.. Griffith. Samuel J.. Solnik. Anu. Saag. Lauri. D'Atanasio. Eugenia. Metspalu. Ene. January 2021. Genetic ancestry changes in Stone to Bronze Age transition in the East European plain. Science Advances. en. 7. 4. eabd6535. 10.1126/sciadv.abd6535. 7817100. 33523926. 2021SciA....7.6535S.
  2. Haak. Wolfgang. Lazaridis. Iosif. Patterson. Nick. Rohland. Nadin. Mallick. Swapan. Llamas. Bastien. Brandt. Guido. Nordenfelt. Susanne. Harney. Eadaoin. Stewardson. Kristin. Fu. Qiaomei. February 10, 2015. Massive migration from the steppe is a source for Indo-European languages in Europe. bioRxiv. en. 013433. 10.1101/013433. 196643946. 1502.02783. February 8, 2021. December 23, 2019. https://web.archive.org/web/20191223032847/https://www.biorxiv.org/content/10.1101/013433v1. live.
  3. Raghavan. Maanasa. Skoglund. Pontus. Graf. Kelly E.. Metspalu. Mait. Albrechtsen. Anders. Moltke. Ida. Rasmussen. Simon. Stafford Jr. Thomas W.. Orlando. Ludovic. Metspalu. Ene. Karmin. Monika. January 2014. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature. en. 505. 7481. 87–91. 10.1038/nature12736. 4105016. 24256729. 2014Natur.505...87R.
  4. Narasimhan . Vagheesh M. . Patterson . Nick. Moorjani . Priya. Rohland . Nadin . Bernardos . Rebecca . Mallick . Swapan . Lazaridis . Iosif . Nakatsuka . Nathan . Olalde . Iñigo . Lipson . Mark . Kim . Alexander M.. September 6, 2019. The formation of human populations in South and Central Asia . Science . en . 365 . 6457 . eaat7487 . 10.1126/science.aat7487 . 31488661 . 6822619 . Y chromosome haplogroup types R1b or R1a not represented in Iran and Turan in this period ....
  5. Book: Zerjal, T. . et al . The use of Y-chromosomal DNA variation to investigate population history: recent male spread in Asia and Europe . S. S. . Papiha . R. . Deka . R. . Chakraborty . amp . Genomic diversity: applications in human population genetics . 1999 . 91–101 . New York . Kluwer Academic/Plenum Publishers . 978-0-3064-6295-5 . .
  6. News: Tony . Joseph . June 16, 2017 . How genetics is settling the Aryan migration debate . The Hindu . June 2, 2019 . October 4, 2023 . https://web.archive.org/web/20231004150643/https://www.thehindu.com/sci-tech/science/how-genetics-is-settling-the-aryan-migration-debate/article19090301.ece . live .
  7. Klejn . Leo S. . The Steppe Hypothesis of Indo-European Origins Remains to be Proven . Acta Archaeologica . April 22, 2017 . 88 . 1 . 193–204 . 10.1111/j.1600-0390.2017.12184.x . 0065-101X . November 23, 2022 . December 25, 2022 . https://web.archive.org/web/20221225050330/https://brill.com/view/journals/acar/88/1/article-p193_13.xml . live . "As for the Y-chromosome, it was already noted in Haak, Lazaridis et al. (2015) that the Yamnaya from Samara had Y-chromosomes which belonged to R-M269 but did not belong to the clade common in Western Europe (p. 46 of supplement). Also, not a single R1a in Yamnaya unlike Corded Ware (R1a-dominated)."
  8. Book: Koch . John T. . Cunliffe . Barry . Celtic from the West 3: Atlantic Europe in the Metal Ages . 2016 . Oxbow Books . 978-1-78570-228-0 . 634 . en . November 23, 2022 . November 23, 2022 . https://web.archive.org/web/20221123082348/https://books.google.com/books?id=Gv4sDwAAQBAJ&pg=PT634 . live .
  9. Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA, Chow CE, Lin AA, Mitra M, Sil SK, Ramesh A, Usha Rani MV, Thakur CM, Cavalli-Sforza LL, Majumder PP, Underhill PA . Polarity and temporality of high-resolution y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists . American Journal of Human Genetics . 78 . 2 . 202–221 . February 2006 . 16400607 . 1380230 . 10.1086/499411. "Although considerable cultural impact on social hierarchy and language in South Asia is attributable to the arrival of nomadic Central Asian pastoralists, genetic data (mitochondrial and Y chromosomal) have yielded dramatically conflicting inferences on the genetic origins of tribes and castes of South Asia. We sought to resolve this conflict, using high-resolution data on 69 informative Y-chromosome binary markers and 10 microsatellite markers from a large set of geographically, socially, and linguistically representative ethnic groups of South Asia. We found that the influence of Central Asia on the pre-existing gene pool was minor. The ages of accumulated microsatellite variation in the majority of Indian haplogroups exceed 10,000–15,000 years, which attests to the antiquity of regional differentiation. Therefore, our data do not support models that invoke a pronounced recent genetic input from Central Asia to explain the observed genetic variation in South Asia. R1a1 and R2 haplogroups indicate demographic complexity that is inconsistent with a recent single history. Associated microsatellite analyses of the high-frequency R1a1 haplogroup chromosomes indicate independent recent histories of the Indus Valley and the peninsular Indian region."
  10. Thanseem I, Thangaraj K, Chaubey G, Singh VK, Bhaskar LV, Reddy BM, Reddy AG, Singh L . Genetic affinities among the lower castes and tribal groups of India: inference from Y chromosome and mitochondrial DNA . BMC Genetics . 7 . 42 . August 2006 . 16893451 . 1569435 . 10.1186/1471-2156-7-42 . free .
  11. Sahoo S, Singh A, Himabindu G, Banerjee J, Sitalaximi T, Gaikwad S, Trivedi R, Endicott P, Kivisild T, Metspalu M, Villems R, Kashyap VK . A prehistory of Indian Y chromosomes: evaluating demic diffusion scenarios . Proceedings of the National Academy of Sciences of the United States of America . 103 . 4 . 843–848 . January 2006 . 16415161 . 1347984 . 10.1073/pnas.0507714103 . free . 2006PNAS..103..843S .
  12. Thangaraj K, Naidu BP, Crivellaro F, Tamang R, Upadhyay S, Sharma VK, Reddy AG, Walimbe SR, Chaubey G, Kivisild T, Singh L . The influence of natural barriers in shaping the genetic structure of Maharashtra populations . PLOS ONE . 5 . 12 . e15283 . December 2010 . 21187967 . 3004917 . 10.1371/journal.pone.0015283 . Cordaux R . free . 2010PLoSO...515283T .
  13. Book: Lalueza-Fox, C. . Inequality: A Genetic History . MIT Press . 2022 . 978-0-262-04678-7 . July 16, 2023 . 81–82 . July 16, 2023 . https://web.archive.org/web/20230716102536/https://books.google.com/books?id=xLZNEAAAQBAJ&pg=PA81 . live .
  14. Arame's English blog, Y DNA from ancient Near East
  15. Web site: About Us. Family Tree DNA. December 20, 2019. August 15, 2019. https://web.archive.org/web/20190815072247/https://www.familytreedna.com/groups/r-1a/about/results. live.
  16. Web site: ISOGG 2017 Y-DNA Haplogroup R . isogg.org . December 20, 2019 . February 10, 2007 . https://web.archive.org/web/20070210011401/https://isogg.org/tree/ISOGG_HapgrpR.html . live .
  17. Web site: Haplogroup R (Y-DNA) - SNPedia. www.snpedia.com. December 20, 2019. May 5, 2018. https://web.archive.org/web/20180505070347/https://www.snpedia.com/index.php/Haplogroup_R_(Y-DNA). live.
  18. Web site: Eurogenes Blog . March 21, 2016 . R1a in Yamnaya . December 20, 2019 . https://web.archive.org/web/20180505065717/http://eurogenes.blogspot.no/2016/03/r1a-in-yamnaya.html . May 5, 2018 . dead.
  19. Web site: Y-DNA Haplogroup R and its Subclades . International Society of Genetic Genealogy (ISOGG) . January 8, 2011 . March 30, 2019 . https://web.archive.org/web/20190330193219/https://isogg.org/tree/ISOGG_HapgrpR.html . live .
  20. https://www.yfull.com/arch-5.07/tree/R1a/
  21. Web site: Krahn . Thomas . Draft Y-Chromosome Tree . December 7, 2012 . https://web.archive.org/web/20130526205543/http://ytree.ftdna.com/index.php?name=Draft&parent=99812767 . May 26, 2013 . dead .
  22. https://www.yfull.com/arch-5.07/tree/R-M459/
  23. Freder . Janine . Anthropological investigation in due consideration of the ethnical background . Die mittelalterlichen Skelette von Usedom: Anthropologische Bearbeitung unter besonderer Berücksichtigung des ethnischen Hintergrundes . de . 2010 . 10.17169/refubium-8995 . 86 . Freie Universität Berlin .
  24. https://cyberleninka.ru/article/n/tyurki-kavkaza-sravnitelnyy-analiz-genofondov-po-dannym-o-y-hromosome "high frequency of R1a among the Kuban Nogais (the R1a1a1g-M458 subclade accounts for 18%"
  25. 2987245 . 2009 . Underhill . P. A. . Myres . N. M. . Rootsi . S. . Metspalu . M. . Zhivotovsky . L. A. . King . R. J. . Lin . A. A. . Chow . C. E. . Semino . O. . Battaglia . V. . Kutuev . I. . Järve . M. . Chaubey . G. . Ayub . Q. . Mohyuddin . A. . Mehdi . S. Q. . Sengupta . S. . Rogaev . E. I. . Khusnutdinova . E. K. . Pshenichnov . A. . Balanovsky . O. . Balanovska . E. . Jeran . N. . Augustin . D. H. . Baldovic . M. . Herrera . R. J. . Thangaraj . K. . Singh . V. . Singh . L. . Majumder . P. . Separating the post-Glacial coancestry of European and Asian y chromosomes within haplogroup R1a . European Journal of Human Genetics . 18 . 4 . 479–484 . 10.1038/ejhg.2009.194 . 19888303 .
  26. Web site: Peter . Gwozdz . August 6, 2018 . Polish Y-DNA Clades . July 15, 2016 . July 15, 2016 . https://web.archive.org/web/20160715055534/http://www.gwozdz.org/PolishClades.html#L260M458News . live .
  27. 8433500 . 2021 . Kars . M. E. . Başak . A. N. . Onat . O. E. . Bilguvar . K. . Choi . J. . Itan . Y. . Çağlar . C. . Palvadeau . R. . Casanova . J. L. . Cooper . D. N. . Stenson . P. D. . Yavuz . A. . Buluş . H. . Günel . M. . Friedman . J. M. . Özçelik . T. . The genetic structure of the Turkish population reveals high levels of variation and admixture . Proceedings of the National Academy of Sciences of the United States of America . 118 . 36 . e2026076118 . 10.1073/pnas.2026076118 . 34426522 . 2021PNAS..11826076K . free .
  28. 10.1537/ase.080422 . Y-haplogroup frequencies in the Slovak Romany population . 2009 . Petrejcíková . Eva . Soták . Miroslav . Bernasovská . Jarmila . Bernasovský . Ivan . Sovicová . Adriana . Bôziková . Alexandra . Boronová . Iveta . Švícková . Petra . Gabriková . Dana . Maceková . Sona . Anthropological Science . 117 . 2 . 89–94 . free .
  29. Posth . Cosimo . Yu . He . Ghalichi . Ayshin . Rougier . Hélène . Crevecoeur . Isabelle . Huang . Yilei . Ringbauer . Harald . Rohrlach . Adam B. . Nägele . Kathrin . Villalba-Mouco . Vanessa . Radzeviciute . Rita . Ferraz . Tiago . Stoessel . Alexander . Tukhbatova . Rezeda . Drucker . Dorothée G. . 2023-03-01 . Palaeogenomics of Upper Palaeolithic to Neolithic European hunter-gatherers . Nature . en . 615 . 7950 . 117–126 . 10.1038/s41586-023-05726-0 . 36859578 . 9977688 . 1476-4687. 10256/23099 . free .
  30. Web site: Lichtenstein Cave Data Analysis . Schweitzer . D. . dirkschweitzer.net . March 23, 2008 . dead . https://web.archive.org/web/20110814164431/http://dirkschweitzer.net/LichtensteinCaveAnalysis0804DS.pdf . August 14, 2011. Summary in English of .
  31. Korniyenko, I. V.; Vodolazhsky D. I. "Использование нерекомбинантных маркеров Y-хромосомы в исследованиях древних популяций (на примере поселения Танаис)" [The use of non-recombinant markers of the Y-chromosome in the study of ancient populations (on the example of the settlement of Tanais)]. Материалы Донских антропологических чтений [Materials of the Don Anthropological Readings]. Rostov-on-Don: Rostov Research Institute of Oncology, 2013.
  32. Sanchez . J . Børsting . C . Hallenberg . C . Buchard . A . Hernandez . A . Morling . N . 2003 . Multiplex PCR and minisequencing of SNPs—a model with 35 Y chromosome SNPs . Forensic Science International . 137 . 1 . 74–84 . 10.1016/S0379-0738(03)00299-8 . 14550618 .
  33. Underhill. Peter A.. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. European Journal of Human Genetics. January 1, 2015. 23. 1. 124–131. 10.1038/ejhg.2014.50 . en. 24667786. 4266736.
  34. L. Barać et al. Y chromosomal heritage of Croatian population and its island isolates. European Journal of Human Genetics. 11. 7. 2003. 535–42. 12825075. 10.1038/sj.ejhg.5200992. 15822710. free.
  35. S. Rootsi et al. Phylogeography of Y-Chromosome Haplogroup I Reveals Distinct Domains of Prehistoric Gene Flow in Europe. American Journal of Human Genetics. 75. 1. 2004. 128–137. 15162323. 1181996. 10.1086/422196. February 13, 2021. September 5, 2020. https://web.archive.org/web/20200905162020/http://evolutsioon.ut.ee/publications/Rootsi2004.pdf. dead.
  36. M. Peričić et al. High-resolution phylogenetic analysis of southeastern Europe traces major episodes of paternal gene flow among Slavic populations. Molecular Biology and Evolution. 22. 10. 1964–75. 2005. 15944443. 10.1093/molbev/msi185. free.
  37. M. Peričić et al. Review of Croatian Genetic Heritage as Revealed by Mitochondrial DNA and Y Chromosomal Lineages. Croatian Medical Journal. 46. 4. 2005. 502–513. 16100752.
  38. Web site: Untitled . pereformat.ru . ru . May 29, 2017 . March 15, 2016 . https://web.archive.org/web/20160315183214/http://pereformat.ru/wp-content/uploads/2015/02/russian-plain-01.jpg . live .
  39. Web site: Untitled . www.rodstvo.ru . May 29, 2017 . September 16, 2021 . https://web.archive.org/web/20210916214250/https://www.rodstvo.ru/forum/index.php?act=attach&type=post&id=1299 . dead .
  40. Book: http://evolutsioon.ut.ee/publications/Kivisild2003a.pdf . Toomas Kivisild . Siiri Rootsi . Mait Metspalu . Ene Metspalu . Juri Parik . Katrin Kaldma . Esien Usanga . Sarabjit Mastana . Surinder S. Papiha . Richard Villems . The Genetics of Language and Farming Spread in India . December 20, 2019 . Examining the farming/language dispersal hypothesis . P. Bellwood . C. Renfrew . McDonald Institute Monographs . Cambridge University . 215–222 . February 19, 2006 . https://web.archive.org/web/20060219054915/http://evolutsioon.ut.ee/publications/Kivisild2003a.pdf . dead .
  41. Zhong . Hua . Shi . Hong . Qi . Xue-Bin . Duan . Zi-Yuan . Tan . Ping-Ping . Jin. Li . Su . Bing . Ma . Runlin Z. . 2011 . Extended Y Chromosome Investigation Suggests Postglacial Migrations of Modern Humans into East Asia via the Northern Route. Molecular Biology and Evolution . 28 . 1 . 717–727 . 10.1093/molbev/msq247 . free . 20837606.
  42. Shou . Wei-Hua . Qiao . En-Fa . Wei . Chuan-Yu . Dong . Yong-Li . Tan . Si-Jie . Shi . Hong . Tang . Wen-Ru . Xiao . Chun-Jie . 2010 . Y-chromosome distributions among populations in Northwest China identify significant contribution from Central Asian pastoralists and lesser influence of western Eurasians . Journal of Human Genetics . 55 . 5 . 314–322 . 10.1038/jhg.2010.30 . 20414255 . 23002493 . free .
  43. Changmai . Piya . Jaisamut . Kitipong . Kampuansai . Jatupol . Kutanan . Wibhu . 2022 . Indian genetic heritage in Southeast Asian populations . PLOS Genetics . 18 . 2 . e1010036 . 10.1371/journal.pgen.1010036 . 8853555 . 35176016 . free .