Does smoking make a man less of a man?

Editor's Introduction

Smoking is associated with mosaic loss of chromosome Y

annotated by
Robin Williamson

The genetics of some diseases (e.g., cystic fibrosis, Huntington's disease) is fairly clear-cut: A person is born with a disruption, or mutation, in a critical gene that causes the disease. The genetics of many common diseases, however, is more complicated. In most cases, diseases like cancer, inflammatory bowel disease, or hypertension cannot be explained by mutations in a single gene. Recent work has suggested that people may instead be born with variants in many genes that may increase the risk of developing some of these diseases. Although such variants have been identified, our current understanding of them still does not fully explain how these diseases develop. As another piece of this puzzle, there is increasing evidence that changes that occur in a person's DNA during life may affect disease risk. Such changes are often referred to as acquired changes because they occur after cells and tissues are already established. Because each established cell may be affected differently, a mosaic pattern may result in which some cells have the change and some cells do not. In addition, there is increasing evidence that environmental factors also affect disease risk. Here, the authors report that blood cells from men who currently smoke are significantly more likely to have lost their Y chromosome than blood cells from men who do not smoke. This loss of the Y chromosome has previously been associated with the development of cancers.

Paper Details

Original title
Smoking is associated with mosaic loss of chromosome Y
Authors
Jan Dumanski
Original publication date
Reference
Vol. 347 no. 6217 pp. 81-83
Issue name
Science
DOI
10.1126/science.1262092

Abstract

Tobacco smoking is a risk factor for numerous disorders, including cancers affecting organs outside the respiratory tract. Epidemiological data suggest that smoking is a greater risk factor for these cancers in males compared with females. This observation, together with the fact that males have a higher incidence of and mortality from most non–sex-specific cancers, remains unexplained. Loss of chromosome Y (LOY) in blood cells is associated with increased risk of nonhematological tumors. We demonstrate here that smoking is associated with LOY in blood cells in three independent cohorts [TwinGene: odds ratio (OR) = 4.3, 95% confidence interval (CI) = 2.8 to 6.7; Uppsala Longitudinal Study of Adult Men: OR = 2.4, 95% CI = 1.6 to 3.6; and Prospective Investigation of the Vasculature in Uppsala Seniors: OR = 3.5, 95% CI = 1.4 to 8.4] encompassing a total of 6014 men. The data also suggest that smoking has a transient and dose-dependent mutagenic effect on LOY status. The finding that smoking induces LOY thus links a preventable risk factor with the most common acquired human mutation.

Report

Tobacco smoking killed ~100 million people during the 20th century and is projected to kill ~1 billion people during the current century, assuming that the current frequency of smoking is retained (1, 2). Lung cancer is the prime cause of cancer-associated death in relation to smoking. However, smoking is also a risk factor for tumors outside the respiratory tract, and these are more common in males than females [hazard ratio in males: 2.2, 95% confidence interval (CI) = 1.7 to 2.8; in females: 1.7, 95% CI = 1.4–2.1] (2). Moreover, males have a higher incidence and mortality from most non–sex-specific cancers, disregarding smoking status, and this fact is largely unexplained by known risk factors (3, 4). A recent analysis of noncancerous blood cells revealed that a male-specific chromosomal aberration, acquired mosaic loss of chromosome Y (LOY), is associated with an increased risk of nonhematological tumors among aging males (5).

Here, we analyzed possible causes of LOY by studying 6014 men from three independent prospective cohorts—TwinGene, n = 4373 (6, 7); Uppsala Longitudinal Study of Adult Men (ULSAM), n = 1153 (8); and Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS), n = 488 (9)—from which comprehensive epidemiological records are available (tables S2 to S4). We included the following environmental, lifestyle, and clinical factors in the analyses: smoking, age, hypertension, exercise habits, diabetes, body mass index, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, education level, and alcohol intake. We also included genotyping quality as a confounder in the regression analyses, to adjust for possible influence of experimental noise. Similar definitions of factors were used in all cohorts, as outlined in tables S2 to S5 and described in detail in the materials and methods section of the supplementary materials. Estimation of LOY was based on single-nucleotide polymorphism (SNP)–array data using the 2.5MHumanOmni and HumanOmniExpress beadchips in the ULSAM and PIVUS/TwinGene studies, respectively (fig. S1). The estimation of the degree of mosaicism and scoring of LOY was undertaken using the continuous median logR ratio (mLRR-Y) estimate, calculated from SNP-array data as the median of the logR ratio of all SNP probes within the male-specific part of chromosome Y (MSY), as described previously (5). An mLRR-Y estimate close to zero indicates a normal chromosome Y state, whereas more negative mLRR-Y values denote an increasing level of blood cells with LOY. To facilitate comparisons between the three cohorts, we corrected the mLRR-Y values for all participants, using cohort-specific correction constants, as explained in the supplementary materials (figs. S1 and S2).

LOY was by far the most common postzygotic mutation found in the three cohorts. The age range at sampling in ULSAM and PIVUS was 70.7 to 83.6 years and 69.8 to 70.7 years, respectively, and we found LOY in 12.6% of ULSAM participants and 15.6% of PIVUS participants (figs. S3 and S4). The age range at sampling in TwinGene was 48 to 93 years, and the frequency of LOY in the entire cohort was 7.5% (fig. S5). However, in TwinGene participants aged 70 years or older, 15.4% had LOY, which is similar to the LOY frequency in the other cohorts in the same age range. In TwinGene participants younger than 70 years, only 4.1% were affected by LOY. Thresholds for LOY scoring, at the lower 99% confidence limit of the distributions of experimental mLRR-Y variation (TwinGene = –0.1324, ULSAM = –0.1024, and PIVUS = –0.1182), were used for frequency calculations, as explained in the supplementary materials and figs. S3 to S5. At this degree of mosaicism, ~10% of the analyzed nucleated blood cells from each sample are affected by LOY. Mosaic LOY in blood detected with SNP arrays was validated using whole-genome next-generation sequencing (NGS) in 100 random participants in the ULSAM cohort. There was 100% concordance in LOY scoring between results from SNP array and NGS data (5).

We found a strong association between smoking and LOY status in the three independent cohorts. Current smokers had a significantly higher degree of LOY mosaicism, compared with nonsmokers and past smokers, in unadjusted models (Fig. 1) and in the multivariable models adjusting for the above-mentioned potential confounders (table S1, analysis of covariance: TwinGene F(1,1666) = 45.4, P = 2.225 × 10^-11; ULSAM F(1,968) = 17.6, P = 2.984 × 10^-5; PIVUS F(1,385) = 9.1, P = 0.0028). Apart from smoking, the only other factor significantly associated with LOY was age, which was observed in TwinGene, with higher degree of LOY in older participants. The age range in ULSAM and PIVUS was narrower (see above), which may explain why age had no effect on LOY in these regression models (tables S1 and S5). To assess the risk of LOY in blood cells of smokers, we used logistic regression adjusting for the same confounders as in table S1. Participants were LOY-scored as 1 or 0 before these analyses based on the continuous mLRR-Y estimate, using the same threshold as for the estimations of LOY frequencies, i.e., the lower 99% CIs of the experimentally induced mLRR-Y variation (figs. S3 to S5). The adjusted odds ratio (OR) estimates from logistic regressions were highest in TwinGene (OR = 4.3, 95% CI = 2.8 to 6.7), followed by PIVUS (OR = 3.5, 95% CI = 1.4 to 8.4) and ULSAM (OR = 2.4, 95% CI = 1.6 to 3.6). The corresponding unadjusted ORs are given in table S6. Based on these calculations, we estimate that current smokers in the studied cohorts had a 2.4 to 4.3 times greater risk of displaying LOY compared with nonsmokers. Furthermore, among the current smokers in the large TwinGene cohort, we found a strong dose-response effect with more LOY in heavy smokers; i.e., smokers with LOY had been smoking significantly more pack-years compared to smokers without LOY (Fig. 2C).
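To make the odds ratios above concrete, here is a minimal Python sketch of an unadjusted odds ratio with a Wald 95% confidence interval computed from a 2 × 2 table. The counts are invented for illustration and are not data from the study, and the function name `odds_ratio_ci` is hypothetical. The ORs reported by the authors come from logistic regression adjusted for confounders, which this sketch does not attempt to reproduce.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z = 1.96 for ~95%)."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    or_estimate = odds_exposed / odds_unexposed
    # Standard error of ln(OR), computed from the four cell counts
    se_log_or = math.sqrt(1 / exposed_cases + 1 / exposed_noncases
                          + 1 / unexposed_cases + 1 / unexposed_noncases)
    lower = math.exp(math.log(or_estimate) - z * se_log_or)
    upper = math.exp(math.log(or_estimate) + z * se_log_or)
    return or_estimate, lower, upper


# Hypothetical counts: 20 of 100 smokers and 10 of 100 nonsmokers scored as LOY
or_estimate, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_estimate:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

In this invented example the interval includes 1, so it would not indicate a significant association. The intervals reported for the three cohorts all exclude 1.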

Fig. 1. The association between smoking status and the level of LOY (i.e., mLRR-Y) in three independent cohorts. In all cohorts, these unadjusted analyses indicate that the current smokers (Yes) (table S5) had a significantly higher degree of mosaic LOY in blood, compared with noncurrent smokers (No), composed of never-smokers and previous smokers. ***P < 0.001; *P < 0.05 (Kolmogorov-Smirnov tests: TwinGene, D = 0.15, P = 1.131 × 10^-11; ULSAM, D = 0.15, P = 0.0006; PIVUS, D = 0.23, P = 0.0203). The definitions used for LOY scoring and the entire ranges of mLRR-Y data observed in each cohort are shown in figs. S3 to S5.

Experimental Question

Is smoking associated with a man’s degree of mosaic LOY?

This was assessed experimentally by asking …

How does the degree of mosaic LOY, as measured by mLRR-Y, in current smokers (“Yes” group) compare with the degree of mosaic LOY in never-smokers and previous smokers (“No” group) in each of the three data sets (TwinGene, ULSAM, and PIVUS)?

Methods

The degree of mosaic LOY was scored with the continuous median logR ratio (LRR) of SNP array data collected from markers on the Y chromosome (mLRR-Y).

LRR = log2(observed R/expected R); mLRR-Y = the median of the LRRs of all Y-chromosome probes measured in one individual.

If mLRR = 0, the observed signal is the same as the expected signal.

If mLRR < 0, the observed signal is less than the expected signal, suggesting the DNA tested by that probe is found in fewer copies than expected.

If mLRR > 0, the observed signal is greater than the expected signal, suggesting the DNA tested by that probe is found in more copies than expected.
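The definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names `log_r_ratio` and `median_lrr` and the probe intensities are hypothetical, and the cohort-specific corrections described in the supplementary materials are not included.

```python
import math
from statistics import median


def log_r_ratio(observed_r, expected_r):
    """LRR for a single probe: log2(observed R / expected R)."""
    return math.log2(observed_r / expected_r)


def median_lrr(observed, expected):
    """Median of the per-probe LRRs for one individual."""
    return median(log_r_ratio(o, e) for o, e in zip(observed, expected))


# Hypothetical intensities for five Y-chromosome probes in one sample
expected = [1.0, 1.0, 1.0, 1.0, 1.0]
observed = [0.95, 0.90, 0.85, 0.92, 0.88]
print(median_lrr(observed, expected))  # negative: less signal than expected
```

Taking the median rather than the mean keeps a single noisy probe from dominating the estimate.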

A box plot was used to represent the data. In a box plot, the bottom edge of the box is set at the value of the 1st quartile of data; the horizontal line across the middle of the box is set at the value of the 2nd quartile, or median; and the top edge of the box is set at the value of the 3rd quartile. (For example, if 17 data points were graphed in order from smallest to largest, the 1st quartile is the value of the 5th data point, the median is the value of the 9th data point, and the 3rd quartile is the value of the 13th data point.) The difference between the value of the 3rd quartile and the value of the 1st quartile is called the interquartile range (IQR). In this box plot, the bottom horizontal line, or “whisker,” was set at the smallest data point that was no more than 1.5 × IQR below the value of the 1st quartile, and the top “whisker” was set at the largest data point that was no more than 1.5 × IQR above the value of the 3rd quartile.
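The box plot rules described above can be checked with a short Python sketch on 17 made-up data points. The function name `box_plot_summary` is hypothetical, and the "inclusive" quantile method is an assumption chosen because it reproduces the 5th/9th/13th-point example; other plotting software may use slightly different quartile conventions.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Quartiles, whisker ends, and outliers using the 1.5 x interquartile-range rule."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    spread = q3 - q1  # interquartile range
    lower_fence = q1 - 1.5 * spread
    upper_fence = q3 + 1.5 * spread
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1, "median": med, "q3": q3,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers,
    }


# 17 hypothetical data points: 1 through 16 plus one extreme value
points = list(range(1, 17)) + [40]
summary = box_plot_summary(points)
print(summary)
```

Here the quartiles are the 5th, 9th, and 13th sorted values, and the extreme value 40 falls beyond the upper whisker, so it would be drawn as a separate point.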

For more information about box plots: https://www.khanacademy.org/math/probability/descriptive-statistics/box-and-whisker-plots/v/reading-box-and-whisker-plots

To determine whether the degree of mosaic LOY in the “Yes” category was significantly different from that in the “No” category, the authors used the Kolmogorov-Smirnov test. This test can determine how different two data sets are from each other and whether the difference is statistically significant. The magnitude of the difference is reported as a D value, and the significance of the difference is reported as a p-value.
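As an illustration of what the D value measures, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, i.e., the largest vertical gap between the two empirical cumulative distributions. The function name `ks_statistic` and the sample values are hypothetical, and the p-value calculation is deliberately left out; statistical packages such as SciPy (`scipy.stats.ks_2samp`) provide both.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: the maximum distance between the two ECDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The gap between step functions can only change at observed data values
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical values for two groups
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

D ranges from 0 (identical distributions in the samples) to 1 (no overlap at all).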

For more information about the Kolmogorov-Smirnov test: http://www.physics.csbsju.edu/stats/KS-test.html

Traditionally, the result of a statistical comparison is deemed statistically significant if a difference at least as large as the one observed would be expected to arise by chance less than 5% of the time when there is no true difference. This probability is indicated by the p-value. If the p-value for a statistical comparison is less than 0.05 (or 5%), the difference is said to be statistically significant.

For more information about p-values: http://www.statsdirect.com/help/default.htm#basics/p_values.htm

Although the LOY data from the three data sets (TwinGene, ULSAM, and PIVUS) could have been combined, analyses were done on each set separately because differences are considered more robust and universal if they are detected in several data sets independently.

Results 

In each of the three data sets, the degree of mosaic LOY as measured by mLRR-Y was significantly lower in the “No” category (i.e., previous smokers and never-smokers) than in the “Yes” category (i.e., current smokers). As indicated by the asterisks and the figure legend, the magnitude and significance of this difference was D = 0.15 and P = 1.131 × 10^-11 in the TwinGene data set, D = 0.15 and P = 0.0006 in the ULSAM data set, and D = 0.23 and P = 0.0203 in the PIVUS data set.

Conclusions

In these three data sets, smoking was significantly associated with mosaic LOY. 

Our results suggest that the association between smoking and LOY is valid for current smokers only (Fig. 2, A and B). Previous epidemiological studies showed that smoking cessation at any age is associated with dramatically reduced death rates. For smokers who quit at 25 to 34 years of age, survival was nearly identical with those who had never smoked (2, 10). We analyzed the level of LOY after smoking cessation in the ULSAM and the TwinGene cohorts by using LOY data in past regular smokers (table S5). No difference in LOY frequency between never-smokers and previously regular smokers was found (Fig. 2, A and B). One possible explanation for these results is that the previous smokers with LOY died off faster than the rest of the cohort. Another, and perhaps more likely, explanation is that LOY is induced and sustained by smoking and that LOY is a dynamic and reversible process.

Fig. 2. Differences in degree of LOY between different smoking categories within TwinGene (A) and ULSAM (B), as defined in table S5. In TwinGene (A), there was a significant difference between four smoking categories [analysis of variance (ANOVA): F(3,4137) = 22.2, P = 3.028 × 10^-14], and results from post hoc analysis adjusting P values for multiple testing using a Tukey post hoc test are displayed. In ULSAM (B), there was also a significant difference between three smoking categories [ANOVA: F(2,1107) = 12.2, P = 5.812 × 10^-6], and post hoc analysis is shown. In both cohorts, the current smokers had a significantly higher degree of LOY compared with all other categories. The average degree of LOY in the previously regular smokers was not significantly different from the average degree of LOY in the never-smokers in either cohort. (C) A dose-response effect within current smokers in TwinGene, with men smoking the most pack-years also being associated with higher degree of LOY, as defined in fig. S5 (Kolmogorov-Smirnov test: D = 0.2244, P = 0.0010).

Panels A and B: Question

Is having any history of smoking associated with the same degree of mosaic LOY as currently smoking?

This was assessed experimentally by asking, "How does the degree of mosaic LOY, as measured by mLRR-Y, in current smokers compare with the degree of mosaic LOY in never-smokers, party smokers, and previously regular smokers in the TwinGene and ULSAM data sets?"

Panels A and B: Methods 

The degree of mosaic LOY was scored with the continuous median logR ratio (LRR) of SNP array data collected from markers on the Y chromosome (mLRR-Y).

LRR = log2(observed R/expected R); mLRR-Y = the median of the LRRs of all Y-chromosome probes measured in one individual.

If mLRR = 0, the observed signal is the same as the expected signal.

If mLRR < 0, the observed signal is less than the expected signal, suggesting the DNA tested by that probe is found in fewer copies than expected.

If mLRR > 0, the observed signal is greater than the expected signal, suggesting the DNA tested by that probe is found in more copies than expected.

A box plot was used to represent the data. In a box plot, the bottom edge of the box is set at the value of the 1st quartile of data; the horizontal line across the middle of the box is set at the value of the 2nd quartile, or median; and the top edge of the box is set at the value of the 3rd quartile. (For example, if 17 data points were graphed in order from smallest to largest, the 1st quartile is the value of the 5th data point, the median is the value of the 9th data point, and the 3rd quartile is the value of the 13th data point.) The difference between the value of the 3rd quartile and the value of the 1st quartile is called the interquartile range (IQR). In this box plot, the bottom horizontal line, or “whisker,” was set at the smallest data point that was no more than 1.5 × IQR below the value of the 1st quartile, and the top “whisker” was set at the largest data point that was no more than 1.5 × IQR above the value of the 3rd quartile.

To determine whether there were any significant differences in the degree of mosaic LOY between the different smoking categories, the authors used a statistical approach called ANOVA, or ANalysis Of VAriance. Specifically, they used an F-test. The results of an F-test are reported in the format of F(x,y) = z, where x is the between-group degrees of freedom, y is the within-group degrees of freedom, and z is the between-group mean square value/the within-group mean square value. If z is greater than a set value that is designated as significant for a given x and y, then there is a significant difference between groups.
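The F(x,y) = z format described above can be illustrated with a small Python sketch of a one-way ANOVA. The function name `one_way_anova_f` and the three tiny groups are hypothetical and are not study data. Under these formulas, x is the number of groups minus 1 and y is the total number of observations minus the number of groups, so a reported F(3,4137) is consistent with four categories.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, between-group df, within-group df) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Three hypothetical groups of three measurements each
f_value, df_between, df_within = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df_between},{df_within}) = {f_value:.1f}")
```

Converting F into a p-value requires the F distribution, which this sketch omits.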

Traditionally, the result of a statistical comparison is deemed statistically significant if a difference at least as large as the one observed would be expected to arise by chance less than 5% of the time when there is no true difference. This probability is indicated by the p-value. If the p-value for a statistical comparison is less than 0.05 (or 5%), the difference is said to be statistically significant.

Follow-up, or post hoc, tests were then done to individually compare each of the smoking categories with each of the others to determine whether the degree of mosaic LOY in one group was significantly different from that in the other groups. (The F-test indicated only whether there was any difference between the different categories but didn’t specifically indicate which category was different from the others.) Because performing multiple statistical tests increases the chance of a false-positive result (see glossary), a Tukey post hoc test was performed to adjust the p-values for multiple testing.
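A rough calculation shows why an adjustment is needed when several pairwise comparisons are made. The sketch below counts the pairwise comparisons among four smoking categories and estimates the chance of at least one false positive if each test used the 5% threshold without adjustment. It assumes the tests are independent, which is a simplification, and it does not implement the Tukey procedure itself; the function name `family_wise_error_rate` and the category labels are illustrative.

```python
from itertools import combinations


def family_wise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Illustrative labels for four smoking categories
categories = ["never", "party", "previously regular", "current"]
pairs = list(combinations(categories, 2))
fwer = family_wise_error_rate(len(pairs))
print(len(pairs), "pairwise comparisons")
print(f"Unadjusted chance of at least one false positive: {fwer:.1%}")
```

With six comparisons the unadjusted chance is roughly 26%, far above the nominal 5%, which is the problem that post hoc adjustments address.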

Panels A and B: Results

As mentioned in the figure legend, the ANOVA of all smoking categories together indicated that there was a significant difference in the degree of mosaic LOY, as measured by mLRR-Y, between smoking categories in each of the data sets (TwinGene, four categories: F(3,4137) = 22.2, P = 3.028 × 10^-14; ULSAM, three categories: F(2,1107) = 12.2, P = 5.812 × 10^-6).

At the bottom of each panel, the results from the post hoc analysis comparing each of the smoking categories to the others are reported.

For the TwinGene data set, the degree of mosaic LOY in never-smokers (“Group 1”), party smokers (“Group 2”), and previously regular smokers (“Group 3”) was significantly different than that of current smokers (“Group 4”):

Group 1 versus Group 4: P < 0.0001

Group 2 versus Group 4: P < 0.0001

Group 3 versus Group 4: P < 0.0001

For the ULSAM data set, the degree of LOY in never-smokers (“Group 0”) and previously regular smokers (“Group 1”) was significantly different than that of current smokers (“Group 2”):

Group 0 versus Group 2: P < 0.0001

Group 1 versus Group 2: P = 0.0011

Panel C: Question

Is a greater amount of smoking associated with a higher degree of mosaic LOY?

This was assessed experimentally by asking …

Within the current smokers category, how does the amount of smoking done by those with mosaic LOY compare with the amount of smoking done by those without mosaic LOY?

Panel C: Methods

The current smokers in the TwinGene data set were divided into two groups, those with mosaic LOY and those without mosaic LOY. The amount each of these groups smoked as measured by pack-years was then compared.

An individual was classified as “with” mosaic LOY if his mLRR-Y was below the lower 99% confidence limit of the distribution of experimental mLRR-Y variation in the data set (for TwinGene, –0.1324).
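The classification rule can be sketched as a simple threshold check. The three cutoffs below are the cohort thresholds reported in the Report text; the names `LOY_THRESHOLDS` and `score_loy` are hypothetical, and treating a value exactly equal to the threshold as "without" LOY is an assumption.

```python
# Cohort-specific mLRR-Y thresholds quoted in the Report
LOY_THRESHOLDS = {
    "TwinGene": -0.1324,
    "ULSAM": -0.1024,
    "PIVUS": -0.1182,
}


def score_loy(cohort, mlrr_y):
    """Return 1 ("with" LOY) if mLRR-Y is below the cohort threshold, else 0."""
    return 1 if mlrr_y < LOY_THRESHOLDS[cohort] else 0


# Hypothetical mLRR-Y values
print(score_loy("TwinGene", -0.20))  # 1: below the TwinGene threshold
print(score_loy("TwinGene", -0.05))  # 0: close to zero, normal Y state
```

The same hypothetical value can be scored differently in different cohorts because each cohort has its own threshold.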

A pack-year is defined as the number of packs (20 cigarettes each) smoked per day multiplied by the number of years a person has smoked. For example:

·A person who has smoked 2 packs per day for 20 years has a pack-year value of 40

·A person who has smoked ½ pack per day for 20 years has a pack-year value of 10
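The two examples above follow directly from the definition, as this short Python sketch shows. The function names are hypothetical; the 20-cigarette pack size is the one given in the definition.

```python
def pack_years(packs_per_day, years_smoked):
    """Pack-years = packs smoked per day x number of years smoked."""
    return packs_per_day * years_smoked


def pack_years_from_cigarettes(cigarettes_per_day, years_smoked,
                               cigarettes_per_pack=20):
    """Same quantity, starting from cigarettes per day (20 per pack)."""
    return pack_years(cigarettes_per_day / cigarettes_per_pack, years_smoked)


print(pack_years(2, 20))    # 40
print(pack_years(0.5, 20))  # 10.0
```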

A box plot was used to represent the data. In a box plot, the bottom edge of the box is set at the value of the 1st quartile of data; the horizontal line across the middle of the box is set at the value of the 2nd quartile, or median; and the top edge of the box is set at the value of the 3rd quartile. (For example, if 17 data points were graphed in order from smallest to largest, the 1st quartile is the value of the 5th data point, the median is the value of the 9th data point, and the 3rd quartile is the value of the 13th data point.) The difference between the value of the 3rd quartile and the value of the 1st quartile is called the interquartile range (IQR). In this box plot, the bottom horizontal line, or “whisker,” was set at the smallest data point that was no more than 1.5 × IQR below the value of the 1st quartile, and the top “whisker” was set at the largest data point that was no more than 1.5 × IQR above the value of the 3rd quartile.

To determine whether the number of pack-years in the “with” mosaic LOY group was significantly different from that in the “without” mosaic LOY group, the authors used the Kolmogorov-Smirnov test. This test can determine how different two data sets are from each other and whether the difference is statistically significant. The magnitude of the difference is reported as a D value, and the significance of the difference is reported as a p-value.

For more information about the Kolmogorov-Smirnov test: http://www.physics.csbsju.edu/stats/KS-test.html

Traditionally, the result of a statistical comparison is deemed statistically significant if a difference at least as large as the one observed would be expected to arise by chance less than 5% of the time when there is no true difference. This probability is indicated by the p-value. If the p-value for a statistical comparison is less than 0.05 (or 5%), the difference is said to be statistically significant.

Panel C: Results 

In the TwinGene data set, smokers designated as “with” mosaic LOY had a significantly higher median number of pack-years than smokers designated as “without” mosaic LOY. As indicated in the figure legend, the magnitude of this difference was D = 0.2244, and the significance of this difference was P = 0.001.

Conclusions

In the TwinGene and ULSAM data sets, smoking was significantly associated with mosaic LOY; however, there was no association in men who had quit smoking. 

In the TwinGene data set, men with mosaic LOY smoked significantly more than those without mosaic LOY. 

Whether the LOY induced by smoking plays a direct role in cancer is unclear. One hypothesis is that smoking is clastogenic—that is, it induces many chromosomal abnormalities, including an incidental loss of the Y chromosome. In this scenario, LOY would be a neutral passenger mutation and a reporter of a general tendency of chromosome missegregation in mitosis, which is enhanced by smoking and associated with risk for cancer and mortality. A second hypothesis is that LOY in blood cells is a causative factor in cancer development, possibly through effects on tumor immunosurveillance (11). We conducted a preliminary test of this hypothesis by investigating possible functional consequences of LOY in sorted blood cells from three ULSAM survivors scored with LOY and still alive at the age of 91 years. We sorted cells from three compartments (granulocytes, CD4+ T-lymphocytes, and CD19+ B-lymphocytes) in subjects that displayed LOY in earlier serial analyses of whole blood, performed at four time points during two decades (fig. S6). The three subjects were free from cancer diagnoses at the time of blood collection at 91 years. The data from these experiments are tantalizing because (i) the percentage of cells with LOY differed between different compartments of the hematopoietic system, and (ii) ULSAM-1412 suggests that LOY might be an oligo-clonal process, because cells derived from myeloid and some (but not all) lymphoid progenitors display LOY. These preliminary results support the second hypothesis. If LOY were a phenotypically neutral passenger mutation, one would expect that LOY cells would be randomly distributed within all components of the hematological system. It was recently shown that LOY status of blood cells is associated with a higher risk for all-cause mortality as well as a higher risk for nonhematological cancers and that it can be considered as a biomarker of male carcinogenesis (5). 
We hypothesized that a disrupted tumor-immunosurveillance in LOY-affected cells could help explain the connection between LOY status of noncancerous blood cells and risk for tumors in other tissues (5, 11). These results also support the second hypothesis and the increasingly recognized view that chromosome Y carries many vital functions in biological processes beyond sex determination and sperm production (12–16).

Our results are consistent with a previously described dynamic nature of expanding-contracting noncancerous cell clones in blood affected with mosaic genetic aberrations, i.e., it appears that the relative frequency of cells from a cell clone can first increase and then decrease later in life (5, 17, 18). In the present analyses, LOY was detected in ≥10% of blood cells from about 15% of elderly males in three cohorts (figs. S3 to S5). The cell clones with LOY were likely detectable in our analyses because they are enriched due to an increased proliferative potential as a consequence of LOY, which is in agreement with chromosome Y containing tumor suppressor genes. Recent analysis of >8200 tumor-normal pairs suggests that two genes (ZFY and UTY, from the male-specific part of Y) have properties of tumor suppressors (19). Interestingly, both genes have homologs on chromosome X and escape X inactivation (19, 20). Moreover, other analyses of various tumor collections show that chromosome Y is lost from numerous types of tumors in frequencies ranging from 15 to 80% of cases (21–24). Thus, counting both LOY in noncancerous blood clones and in transformed tumor cells, nullisomy Y is among the most common, if not the most common, human mutation. The results presented here suggest that this aneuploidy, affecting 1.6% of the genome, is likely induced by smoking.

In conclusion, we show that LOY is more common in current smokers compared with noncurrent smokers in three cohorts (Fig. 1 and table S1), that the effect from smoking on LOY is dose dependent, and that this effect appears to be transient, as it disappears after smoking cessation (Fig. 2). Epidemiological observations suggest that smoking could be a greater risk factor for cancer outside the respiratory tract in males compared with females (2, 4, 10). Moreover, males have a higher incidence and mortality from most sex-unspecific cancers (3, 4). The molecular mechanisms behind these observations are not well understood, but LOY, being a male-specific, smoking-induced risk factor, could provide a missing link and help explain these sex differences.

Supplementary Materials

www.sciencemag.org/content/347/6217/81/suppl/DC1

Materials and Methods

Figs. S1 to S6

Tables S1 to S6

References (25–27)

References and Notes

  1. P. Jha, Avoidable global cancer deaths and total deaths from smoking. Nat. Rev. Cancer 9, 655–664 (2009).

  2. P. Jha, C. Ramasundarahettige, V. Landsman, B. Rostron, M. Thun, R. N. Anderson, T. McAfee, R. Peto, 21st-century hazards of smoking and benefits of cessation in the United States. N. Engl. J. Med. 368, 341–350 (2013).

  3. M. B. Cook, K. A. McGlynn, S. S. Devesa, N. D. Freedman, W. F. Anderson, Sex disparities in cancer mortality and survival. Cancer Epidemiol. Biomarkers Prev. 20, 1629–1637 (2011).

  4. G. Edgren, L. Liang, H. O. Adami, E. T. Chang, Enigmatic sex disparities in cancer incidence. Eur. J. Epidemiol. 27, 187–196 (2012).

  5. L. A. Forsberg, C. Rasi, N. Malmqvist, H. Davies, S. Pasupulati, G. Pakalapati, J. Sandgren, T. Diaz de Ståhl, A. Zaghlool, V. Giedraitis, L. Lannfelt, J. Score, N. C. Cross, D. Absher, E. T. Janson, C. M. Lindgren, A. P. Morris, E. Ingelsson, L. Lind, J. P. Dumanski, Mosaic loss of chromosome Y in peripheral blood is associated with shorter survival and higher risk of cancer. Nat. Genet. 46, 624–628 (2014).

  6. P. Lichtenstein, P. F. Sullivan, S. Cnattingius, M. Gatz, S. Johansson, E. Carlström, C. Björk, M. Svartengren, A. Wolk, L. Klareskog, U. de Faire, M. Schalling, J. Palmgren, N. L. Pedersen, The Swedish Twin Registry in the third millennium: An update. Twin Res. Hum. Genet. 9, 875–882 (2006).

  7. P. K. Magnusson, C. Almqvist, I. Rahman, A. Ganna, A. Viktorin, H. Walum, L. Halldner, S. Lundström, F. Ullén, N. Långström, H. Larsson, A. Nyman, C. H. Gumpert, M. Råstam, H. Anckarsäter, S. Cnattingius, M. Johannesson, E. Ingelsson, L. Klareskog, U. de Faire, N. L. Pedersen, P. Lichtenstein, The Swedish Twin Registry: Establishment of a biobank and other recent developments. Twin Res. Hum. Genet. 16, 317–329 (2013).

  8. H. Hedstrand, A study of middle-aged men with particular reference to risk factors for cardiovascular disease. Ups. J. Med. Sci. Suppl. 19, 1–61 (1975).

  9. L. Lind, N. Fors, J. Hall, K. Marttala, A. Stenborg, A comparison of three different methods to evaluate endothelium-dependent vasodilation in the elderly: The Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS) study. Arterioscler. Thromb. Vasc. Biol. 25, 2368–2375 (2005).

  10. M. J. Thun, B. D. Carter, D. Feskanich, N. D. Freedman, R. Prentice, A. D. Lopez, P. Hartge, S. M. Gapstur, 50-year trends in smoking-related mortality in the United States. N. Engl. J. Med. 368, 351–364 (2013).

  11. G. P. Dunn, A. T. Bruce, H. Ikeda, L. J. Old, R. D. Schreiber, Cancer immunoediting: From immunosurveillance to tumor escape. Nat. Immunol. 3, 991–998 (2002).

  12. B. Lemos, L. O. Araripe, D. L. Hartl, Polymorphic Y chromosomes harbor cryptic variation with manifold functional consequences. Science 319, 91–93 (2008).

  13. B. Lemos, A. T. Branco, D. L. Hartl, Epigenetic effects of polymorphic Y chromosomes modulate chromatin components, immune response, and sexual conflict. Proc. Natl. Acad. Sci. U.S.A. 107, 15826–15831 (2010).

  14. G. Clark, Genetics: The vital Y chromosome. Nature 508, 463–465 (2014).

  15. D. W. Bellott, J. F. Hughes, H. Skaletsky, L. G. Brown, T. Pyntikova, T. J. Cho, N. Koutseva, S. Zaghlul, T. Graves, S. Rock, C. Kremitzki, R. S. Fulton, S. Dugan, Y. Ding, D. Morton, Z. Khan, L. Lewis, C. Buhay, Q. Wang, J. Watt, M. Holder, S. Lee, L. Nazareth, J. Alföldi, S. Rozen, D. M. Muzny, W. C. Warren, R. A. Gibbs, R. K. Wilson, D. C. Page, Mammalian Y chromosomes retain widely expressed dosage-sensitive regulators. Nature 508, 494–499 (2014).

  16. D. Cortez, R. Marin, D. Toledo-Flores, L. Froidevaux, A. Liechti, P. D. Waters, F. Grützner, H. Kaessmann, Origins and functional evolution of Y chromosomes across mammals. Nature 508, 488–493 (2014).

  17. L. A. Forsberg, C. Rasi, H. R. Razzaghian, G. Pakalapati, L. Waite, K. S. Thilbeault, A. Ronowicz, N. E. Wineinger, H. K. Tiwari, D. Boomsma, M. P. Westerman, J. R. Harris, R. Lyle, M. Essand, F. Eriksson, T. L. Assimes, C. Iribarren, E. Strachan, T. P. O’Hanlon, L. G. Rider, F. W. Miller, V. Giedraitis, L. Lannfelt, M. Ingelsson, A. Piotrowski, N. L. Pedersen, D. Absher, J. P. Dumanski, Age-related somatic structural changes in the nuclear genome of human blood cells. Am. J. Hum. Genet. 90, 217–228 (2012).

  18. L. A. Forsberg, D. Absher, J. P. Dumanski, Non-heritable genetics of human disease: Spotlight on post-zygotic genetic variation acquired during lifetime. J. Med. Genet. 50, 1–10 (2013).

  19. T. Davoli, A. W. Xu, K. E. Mengwasser, L. M. Sack, J. C. Yoon, P. J. Park, S. J. Elledge, Cumulative haploinsufficiency and triplosensitivity drive aneuploidy patterns and shape the cancer genome. Cell 155, 948–962 (2013).

  20. A. Schneider-Gädicke, P. Beer-Romero, L. G. Brown, R. Nussbaum, D. C. Page, ZFX has a gene structure similar to ZFY, the putative human sex determinant, and escapes X inactivation. Cell 57, 1247–1258 (1989).

  21. L. J. Zhang, E. S. Shin, Z. X. Yu, S. B. Li, Molecular genetic evidence of Y chromosome loss in male patients with hematological disorders. Chin. Med. J. (Engl.) 120, 2002–2005 (2007).

  22. N. O. Bianchi, Y chromosome structural and functional changes in human malignant diseases. Mutat. Res. 682, 21–27 (2009).

  23. L. C. Silva Veiga, N. A. Bérgamo, P. P. Reis, L. P. Kowalski, S. R. Rogatto, Loss of Y-chromosome does not correlate with age at onset of head and neck carcinoma: A case-control study. Braz. J. Med. Biol. Res. 45, 172–178 (2012).

  24. P. H. Duijf, N. Schultz, R. Benezra, Cancer cells preferentially lose small chromosomes. Int. J. Cancer 132, 2316–2326 (2013).

  25. P. Lichtenstein, U. De Faire, B. Floderus, M. Svartengren, P. Svedberg, N. L. Pedersen, The Swedish Twin Registry: A unique resource for clinical, epidemiological and genetic studies. J. Intern. Med. 252, 184–205 (2002).

  26. K. Chee-Seng, S. Xueling, C. Kee-Seng, in Technical Note: Illumina DNA analysis (Illumina, Inc., San Diego, 2008).

  27. R Development Core Team, R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria, 2012).

  28. Acknowledgments: We thank K. Lindblad-Toh, R. M. Myers, C.-H. Heldin, D. H. Ledbetter, G. J. B. van Ommen, and U. Landegren for critical evaluation of the manuscript. This study was sponsored by the Swedish Cancer Society, the Swedish Research Council, the Swedish Heart-Lung Foundation and Science for Life Laboratory Uppsala to J.P.D. and the Olle Enqvist Byggmästare Foundation to L.A.F. Genotyping and next-generation sequencing were performed by the SNP&SEQ Technology Platform in Uppsala, Sweden, and supported by Wellcome Trust Grants WT098017, WT064890, and WT090532, Uppsala University, Uppsala University Hospital, the Swedish Research Council, and the Swedish Heart-Lung Foundation. The SNP&SEQ Technology Platform is part of Science for Life Laboratory at Uppsala University and supported as a national infrastructure by the Swedish Research Council. C.M.L. is a Wellcome Trust Research Career Development Fellow (086596/Z/08/Z). A.P.M. is a Wellcome Trust Senior Research Fellow in Basic Biomedical Science. A.P.M. acknowledges funding from the Wellcome Trust under awards WT064890, WT090532, and WT098017. TwinGene was supported by the Swedish Research Council (M-2005-1112), GenomEUtwin (EU/QLRT-2001-01254 and QLG2-CT-2002-01254), NIH DK U01-066134, the Swedish Foundation for Strategic Research (SSF), and the Heart and Lung foundation no. 20070481. J.P.D. and L.A.F. are cofounders and shareholders in Cray Innovation AB, as well as co-inventors on Patent Application No. PCT/EP2014/071448, protecting the commercial applications of LOY for the assessment of cancer risk. Genetic variants detected in this study are available at the Database of Genomic Structural Variation (dbVar) under accession code nstd92 for ULSAM and PIVUS cohorts and accession code nstd104 for the TwinGene cohort.