2.6: Assignment- Biological Astronaut - Biology

Imagine you are an exploratory astronaut looking for life throughout the universe. However, your instruments register movement and a variety of other signs that make you think life exists on the surface.

  • Part 1: Before taking a potentially dangerous trip to the surface, you must outline a theoretical framework in which another element can serve as a backbone for macromolecules. (Hint: look for an element on the periodic table that would act similarly to carbon.) Begin by describing this new backbone, including how compounds and macromolecules would form. Detail at least 2 chemical reactions forming macromolecules with this backbone. You may wish to add supporting diagrams (created or obtained). Be sure to include references as appropriate.
  • Part 2: Your theoretical framework is deemed strong enough to justify a trip landside. Once there, you are authorized to collect a simple “organism” for experimental use. Collect your specimen(s) and then design a full experiment that will test at least two characteristics that define biological life on Earth. Be sure to include all the relevant parts of an experiment and describe how you would analyze and present the data, results and conclusions.

Basic Requirements (the assignment will not be accepted or assessed unless the following criteria have been met):

  • Assignment has been proofread and does not contain any major spelling or grammatical errors
  • Assignment includes appropriate references


Biological Astronaut
Outcomes: Define biology and apply its principles. Identify the principles of chemistry that are integral to biology.
Define atoms and elements through the identification of an element capable of replacing carbon as the backbone atom in macromolecules. Illustrations are encouraged here. (5 pts)
  • 5.0 pts: Submission identifies an element that can replace carbon as a backbone element. Illustrations and descriptions are included that directly compare the two elements.
  • 4.0 pts: Submission identifies an element that can replace carbon as a backbone element. Description directly compares the two elements.
  • 0.0 pts: No or wrong element chosen, or the justification provided is inaccurate or insufficient for identifying a replacement backbone element.

Discuss the role electrons play in enabling the chosen element to replace carbon. (5 pts)
  • 5.0 pts: Electrons are discussed in terms of atomic bonding behaviors. Direct comparison is provided between carbon and the new element. Illustrations are included to show similarities in bonding activity.
  • 4.0 pts: Electrons are discussed in terms of atomic bonding behaviors. Direct comparison is provided between carbon and the new element.
  • 0.0 pts: Electrons are not discussed or are discussed inaccurately in terms of bonding behaviors.

Identify the components of two macromolecule formation reactions and the macromolecules formed. Illustrations/diagrams are encouraged here. (5 pts)
  • 5.0 pts: Two macromolecules are discussed with the new backbone element; all components in the formation of these molecules are correctly identified. Illustrations are included. There is a direct comparison between the new macromolecules and traditional macromolecules.
  • 4.0 pts: Two macromolecules are discussed with the new backbone element; all components in the formation of these molecules are correctly identified. Illustrations may be included.
  • 0.0 pts: Fewer than two formation reactions are included, or the components are not all properly identified.

Identify at least 4 altered functional groups formed without the use of carbon. (5 pts)
  • 5.0 pts: At least 4 functional groups are identified that would form without carbon. Descriptions are detailed, and well-diagrammed images/figures are included to illustrate these groups.
  • 4.0 pts: At least 4 new functional groups are identified that would form without carbon. Descriptions are adequate or basic illustrations are included.
  • 0.0 pts: Fewer than 4 functional groups are identified/discussed, or the new functional groups are inaccurate.

Demonstrate the critical thinking required to conduct a scientific experiment by designing a full experiment to test if an organism is ‘alive’. This experiment contains all the required components of an experiment. (5 pts)
  • 5.0 pts: All components of an experiment are included and clearly discussed. At least 3 standardized variables are included. Hypothesis is clear and testable. Potential confounding factors or areas of uncertainty are identified.
  • 4.0 pts: All components of an experiment are included. Hypothesis is clear and testable.
  • 0.0 pts: Experiment does not include all the required components of an experiment, or components are inaccurately identified/labeled. Hypothesis is not testable.

The designed experiment tests at least one characteristic that defines biological life. (5 pts)
  • 5.0 pts: Experiment tests at least one characteristic of biological life, and the way this is tested is clearly explained.
  • 4.0 pts: Experiment tests at least one characteristic of biological life.
  • 0.0 pts: Experiment does not test a characteristic of biological life.

Possible results from the experiment are discussed, along with how it would be concluded whether this organism is living or not. The organism is also identified as analogous to either a prokaryotic or a eukaryotic organism. (5 pts)
  • 5.0 pts: Potential results are discussed, and the discussed results logically follow from the outlined experiment. Nuances of possible conclusions are discussed. Organism is identified by cell type.
  • 4.0 pts: Potential results are discussed, and the discussed results logically follow from the outlined experiment. Organism is identified by cell type.
  • 0.0 pts: Potential results are not discussed, or the discussed results do not logically follow from the outlined experiment. Organism is not identified by cell type.

Total points: 35

Background and rationale

Identifying homology relationships between sequences is fundamental to all aspects of biological research. In addition to the pivotal role these inferences play in furthering our understanding of the evolution and diversity of life, they also provide a coherent framework for the extrapolation of biological knowledge between organisms. In this context, orthology inference underpins genome and transcriptome annotation and provides the foundation on which synthetic and systems biology is built. Given the importance of this process to biological research there has been a rich heritage of methodology development in this area with the production of several effective orthology databases and algorithms.

The most widely used methods for orthology inference can be classified into two distinct groups. One group of methods approaches the problem by inferring pairwise relationships between genes in two species, and then extending orthology to multiple species by identifying sets of genes spanning these species in which each gene-pair is orthologous. Popular methods that adopt this approach include MultiParanoid [1] and OMA [2]. A confounding factor to these approaches is that gene duplications cause orthology relationships that are not one-to-one [3] and so orthology is not a transitive relationship (for example, if gene A is an orthologue of gene B, and gene B is an orthologue of gene C, it is not necessarily true that gene A is an orthologue of gene C) [4]. This lack of transitivity means that to capture all pairwise orthology relationships individual genes must be allowed to be members of more than one set [2], or the gene sets must be restricted to subsets of species that share the same last common ancestor [1]. Methods that adopt these pairwise approaches have high levels of precision in recovering orthologues; however, they suffer from low rates of recall in discovering the complete orthogroup due to these complications arising from gene duplications.
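The non-transitivity described above can be illustrated with a toy gene tree. The sketch below assumes a hypothetical scenario (the gene names and tree are not from the study): a gene duplicates in the common ancestor of species X and Y after their split from species Z, and each species then retains a different copy. Two genes are treated as orthologues when their last common ancestor in the gene tree is a speciation node, and as paralogues when it is a duplication node.

```python
# Gene tree as nested tuples: (event, left, right); leaves are gene names.
# "S" marks a speciation node, "D" a duplication node.
# Hypothetical example: z is from species Z, x1 from species X, y2 from species Y.
GENE_TREE = ("S", "z", ("D", "x1", "y2"))


def leaves(node):
    """Return the set of gene names below a node."""
    if isinstance(node, str):
        return {node}
    _, left, right = node
    return leaves(left) | leaves(right)


def lca_event(node, a, b):
    """Return the event label at the last common ancestor of genes a and b."""
    _, left, right = node
    for child in (left, right):
        if not isinstance(child, str) and {a, b} <= leaves(child):
            return lca_event(child, a, b)
    return node[0]


def are_orthologues(a, b):
    """Two genes are orthologues if they diverged at a speciation event."""
    return lca_event(GENE_TREE, a, b) == "S"


print(are_orthologues("x1", "z"))   # True
print(are_orthologues("z", "y2"))   # True
print(are_orthologues("x1", "y2"))  # False: separated by the duplication
```

Here x1 and y2 are each orthologous to z, yet they are paralogues of one another, so no single set of mutually orthologous genes can contain all three.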

The second group of methods do not adopt this pairwise strategy but rather attempt to identify complete orthogroups. An orthogroup is the set of genes that are descended from a single gene in the last common ancestor of all the species being considered [2, 5–9]. Here an orthogroup by definition contains both orthologues and paralogues, and in this context is frequently used as a unit of comparison for comparative genomics [10–12]. In this work we follow this latter approach as it is a logical extension of orthology to multiple species. The most widely used orthogroup inference method is OrthoMCL [13] (usage assessed by citations: n = 870 Scopus citations at the time of writing this article). OrthoMCL uses BLAST [14] to compute sequence similarity scores between sequences in multiple species and then uses the MCL clustering algorithm [15] to identify highly-connected clusters (groups of highly similar sequences) within this dataset.
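To give a feel for how MCL finds highly-connected clusters in a similarity graph, here is a minimal pure-Python sketch of the basic Markov clustering loop (alternating expansion and inflation on a column-stochastic matrix). It is a teaching sketch, not the MCL program used by OrthoMCL or OrthoFinder; the inflation value, iteration count, cut-off and the toy graph are all assumptions chosen for illustration.

```python
def normalise_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        for i in range(n):
            matrix[i][j] /= total
    return matrix


def mcl(adjacency, inflation=2.0, iterations=50):
    """Cluster a weighted, symmetric similarity graph given as a square matrix."""
    n = len(adjacency)
    # Add self-loops so that no column is empty, then normalise.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    normalise_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix (two-step random walks).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power and renormalise, which
        # strengthens strong flows and weakens weak ones.
        m = [[value ** inflation for value in row] for row in m]
        normalise_columns(m)
    # Each row that still carries flow lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, value in enumerate(row) if value > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)


# Toy graph: two triangles joined by one weak edge (similarity 0.1).
toy_graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(toy_graph))
```

Higher inflation values generally produce smaller, tighter clusters, which is why the choice of inflation parameter matters when a method of this kind is applied to sequence similarity graphs.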

In addition to the approaches discussed above, several methods have also been developed that incorporate gene synteny/co-linearity information to assist in the inference of orthogroups [16, 17]. For groups of organisms such as the Kinetoplastids, where gene synteny/co-linearity is well conserved [18] it can provide valuable additional information. However, synteny is not conserved over large evolutionary distances and thus can provide little assistance to the identification of related genes between distantly related groups such as plants and metazoa. Moreover, synteny is unavailable for de novo assembled transcriptomes and for fragmented, low-coverage genome assemblies. Thus there is a need to have accurate methods of orthogroup inference that do not require gene synteny information.

Here we present OrthoFinder, a novel method that infers orthogroups of protein coding genes. It is fast, easy to use and scalable to thousands of genomes. In tests using real benchmark datasets OrthoFinder outperforms all other commonly used orthogroup inference methods by between 8 % and 33 %. We further demonstrate the utility of OrthoFinder through the inference and analysis of plant transcription factor orthogroups. Here we use phylogenetic methods to validate the orthogroups and show that using OrthoFinder to infer orthogroups identifies millions of previously unobserved relationships. Further information about the algorithm can be found at [19] and a standalone implementation of the algorithm is available under the GPLv3 licence at [20].

Problem definition, method evaluation and comparison to other approaches

Gene length bias in BLAST scores affects the accuracy of orthogroup detection

The inference of orthogroups across multiple species requires a fast method to measure pairwise sequence similarity between all sequences in the species being considered. BLAST [14] is the most widely used method to identify and measure similarity between sequences and thus it underpins the majority of orthologue identification methods [9, 13, 21–23]. Analysis of the pairwise BLAST scores that are produced when the full set of protein sequences from one species is BLAST searched against those from another species revealed that there is a clear length dependency in the scores that are obtained (Fig. 1a and b). Short sequences cannot produce large bit scores or low e-values (Fig. 1a and b, respectively), whereas long sequences produce many hits with scores better than those for the best hits of short sequences (Fig. 1a and b). Thus, methods that construct orthogroups by evaluation of BLAST scores in the absence of gene length information should result in a large number of missing genes (low recall) from orthogroups that contain short genes and a large number of incorrectly clustered genes (low precision) in orthogroups that contain long genes.

Fig. 1 Analysis of gene length dependency of BLASTp scores. a BLAST log10(bit score) for all hits between Homo sapiens (Homo_sapiens.GRCh37.60.pep.all, 21,841 sequences) and Mus musculus (Mus_musculus.NCBIM37.60.pep.all, 23,111 sequences). b –log10(e-value) for all hits between Homo sapiens and Mus musculus. To avoid infinite values, BLAST e-values of zero have been replaced with the lowest obtainable value, 10^−180. The heat map in both cases goes from blue (lowest density of hits) to red (highest). c The F-score (red), recall (blue) and precision (green) of orthogroup inference using OrthoMCL plotted as a function of sequence length. The sequences were sorted according to length and divided into four bins with the same number of sequences in each. The F-score, recall and precision were calculated for each bin and the scores plotted against the geometric mean of the length of the sequences in each bin. The error bars show the lower and upper limits of sequence lengths for the shortest and longest sequences in each bin and the dot shows the geometric mean of these lengths. d Histogram of all protein-coding gene lengths in Homo sapiens, provided for reference

To determine if this was the case we assessed the performance of OrthoMCL using the OrthoBench dataset [5]. OrthoBench is the only publicly available benchmark dataset of manually curated orthogroups. The dataset consists of 70 orthogroups of protein coding genes covering 12 species within the Metazoa where each orthogroup contains all the genes derived from a single gene in the last common ancestor of the 12 species considered. For further details concerning the construction, species range and complexity of each orthogroup see [5]. The recall and precision of OrthoMCL were assessed as a function of gene length in this dataset. This revealed that there were strong dependencies between the performance characteristics of OrthoMCL and the length of the gene that was being clustered (Fig. 1c, Additional file 1: Table S1). Specifically, short sequences suffer from a low recall rate (that is, many short sequences fail to be assigned to an orthogroup) and long sequences suffer from low precision (that is, many long sequences are assigned to the incorrect orthogroup) as predicted from the analysis of BLAST scores above. To put these results in perspective the distribution of protein lengths in Homo sapiens is provided in Fig. 1d.
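The length binning used for this assessment (sort sequences by length, split them into four bins of equal size, and summarise each bin by the geometric mean of its lengths) can be sketched as follows. The function names and the example lengths are hypothetical and serve only to illustrate the procedure.

```python
import math


def length_bins(lengths, n_bins=4):
    """Sort lengths and split them into n_bins bins of (near-)equal size."""
    ordered = sorted(lengths)
    size, remainder = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for index in range(n_bins):
        end = start + size + (1 if index < remainder else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


def geometric_mean(values):
    """Geometric mean, computed in log space for numerical stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarise(lengths, n_bins=4):
    """Return (shortest, geometric mean, longest) for every bin."""
    return [(min(b), geometric_mean(b), max(b))
            for b in length_bins(lengths, n_bins)]


# Made-up protein lengths (amino acids), for illustration only.
example_lengths = [100, 1000, 10, 10000, 50, 500, 5000, 200]
for shortest, centre, longest in summarise(example_lengths):
    print(shortest, round(centre, 1), longest)
```

The geometric mean is a natural summary here because protein lengths span several orders of magnitude and are plotted on a logarithmic axis.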

A novel score transform eliminates gene length bias in orthogroup detection

Given that orthogroup inference shows a clear gene length dependency, we sought to develop a transform of the BLAST scores that would reduce the impact of gene length on clustering accuracy. To do this we developed a novel method that determines the gene length dependency of a given pairwise species comparison from an analysis of the bit scores from an all-versus-all BLAST search between the two species. Bit scores were used in place of e-values because the e-value calculation enforces a limit of 1×10^−180: all e-values below this floor are given the same value (that is, 0) (Fig. 1b), and so the length bias in e-values is non-uniform and irreversible. As bit scores do not have a threshold value, and they have been previously shown to be capable of facilitating accurate inference of phylogenetic trees [24], they were selected as the raw data for the development of a novel score transform.

In brief, for each species-pair in turn, the all-vs-all BLAST hits (Fig. 2a) were divided into equal sized bins of increasing sequence length according to the product of the query and hit sequence lengths. The top 5 % of hits in each bin (ranked according to BLAST bit score) were used to represent ‘good’ hits for sequences of that length bin between the given species pair (Fig. 2b). A linear model in log-log space was used to fit a line to these scores using least squares fitting (Fig. 2b). All of the BLAST bit scores that were obtained from each species-pair all-vs-all BLAST search were then transformed using this model so that the best hits between sequences in this species pair have equivalent scores that are independent of sequence length (Fig. 2c and d). Following the transform the poor quality hits for longer sequences were no longer better than the best quality hits for short sequences (Fig. 2c). This normalisation procedure is applied to each pairwise species comparison independently as the behaviour of the BLAST scores is different for each pairwise species comparison (Additional file 2: Figure S1). Importantly, this pairwise length normalisation between species also normalises for phylogenetic distance between species (see ‘Gene length and phylogenetic distance normalisation’ and Additional file 2: Figure S1). Specifically, the normalisation ensures that the best scoring hits between distantly related species achieve the same scores (on average) as the best scoring hits between closely related species (Additional file 2: Figure S1). These length and phylogenetic distance normalised scores were then used as the measure of sequence similarity on which all subsequent analysis and clustering were performed.

Fig. 2 The gene length and phylogenetic distance normalisation procedure for a single species pair. a BLAST bit scores for all hits between Homo sapiens and Mus musculus. b BLAST bit scores for the top 5 % of BLAST hits with least-squares fit of the equation log10(B_qh) = a log10(L_qh) + b, where B_qh is the bit score for the hit between sequence q and sequence h and L_qh is the product of the gene lengths (measured in amino acids). c Gene length and phylogenetic distance normalised BLAST bit scores. Note that there are a large number of poor scoring hits for long sequences due to these hits exceeding the BLAST search e-value cutoff. d The same top 5 % of BLAST hits as shown in (b) after normalisation, for reference
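The normalisation steps for one species pair (bin hits by length product, keep the top 5 % of bit scores per bin, fit a line in log-log space, transform all scores) can be sketched in plain Python as below. Two points are assumptions rather than details stated in this section: the number of bins (`n_bins`) and the exact form of the transform, which is taken here to be division of each bit score by the fitted value 10^b · L^a so that a typical 'good' hit scores about 1 regardless of length.

```python
import math


def top_hits_per_bin(hits, n_bins=10, top_fraction=0.05):
    """hits: list of (length_product, bit_score). Keep the best-scoring
    fraction of hits in each equal-sized bin of increasing length product."""
    ordered = sorted(hits, key=lambda hit: hit[0])
    size = max(1, len(ordered) // n_bins)
    selected = []
    for start in range(0, len(ordered), size):
        chunk = sorted(ordered[start:start + size],
                       key=lambda hit: hit[1], reverse=True)
        keep = max(1, int(len(chunk) * top_fraction))  # at least one hit per bin
        selected.extend(chunk[:keep])
    return selected


def fit_log_log(hits):
    """Least-squares fit of log10(B) = a * log10(L) + b; returns (a, b)."""
    xs = [math.log10(length) for length, _ in hits]
    ys = [math.log10(score) for _, score in hits]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def normalise(bit_score, length_product, a, b):
    """Assumed transform: divide by the fitted 'good hit' score for this length."""
    return bit_score / (10 ** b * length_product ** a)


# Synthetic hits in which the best score grows as 2 * sqrt(L); not real data.
hits = []
for length_product in (100, 400, 1600, 6400):
    best = 2 * math.sqrt(length_product)
    hits += [(length_product, best), (length_product, best / 2),
             (length_product, best / 4)]

a, b = fit_log_log(top_hits_per_bin(hits, n_bins=4))
print(round(a, 3), round(10 ** b, 3))
print([round(normalise(score, lp, a, b), 2) for lp, score in hits])
```

Because the fit is made separately for each species pair, the same step also rescales for phylogenetic distance: the best hits between any two species end up on a comparable scale.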

Application of this novel score transform prior to clustering of the OrthoBench dataset resulted in a dramatic reduction in the length dependency of the clustering results (Fig. 3). Unlike OrthoMCL (Fig. 3a), neither precision, recall nor F-score displayed any dependency on gene length (Fig. 3b). Moreover, precision was substantially increased over the entire range of sequence lengths (Fig. 3b).

Fig. 3 Comparison of OrthoFinder to other orthogroup inference methods. a The length dependency of OrthoMCL. b The length dependency of OrthoMCL using our normalised similarity scores. c The length dependency of the complete OrthoFinder algorithm. For a–c, scores were calculated as in Fig. 1c. d Comparison of the results of OrthoFinder F-score with all other methods tested in OrthoBench. e As in (d) but for recall. f As in (d) but for precision. The error bars show the lower and upper limits of sequence lengths corresponding to the shortest and longest sequences in each bin and the dot shows the geometric mean of these lengths

An improved method for orthogroup delimitation improves overall accuracy

Given that we had reduced gene length bias and that precision was high but recall was low, we assessed whether a method that could identify a higher proportion of cognate gene-pairs prior to clustering could produce an overall increase in clustering accuracy. Many orthology assignment methods make use of reciprocal best BLAST hit (RBH) as it is widely regarded as a high precision method for the identification of orthologous gene-pairs [25–27]. Therefore we also sought to use reciprocal best BLAST hits using our new length-normalised score to assist in construction of the orthogroup graph. Henceforth, we refer to a reciprocal best hit that is obtained using the length-normalised score as an RBNH (reciprocal best normalised hit).
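A reciprocal best hit search over a table of normalised scores can be sketched as follows. The data layout (a dictionary keyed by (query, hit) pairs plus a gene-to-species map) and the gene names are assumptions made for this illustration; ties are resolved arbitrarily in favour of the first hit encountered.

```python
def best_hit_per_species(scores, species):
    """For every query gene, find its best-scoring hit in each other species.

    scores:  {(query, hit): normalised score}
    species: {gene: species name}
    Returns  {(query, target species): (hit, score)}
    """
    best = {}
    for (query, hit), score in scores.items():
        if species[query] == species[hit]:
            continue  # within-species hits are not candidates here
        key = (query, species[hit])
        if key not in best or score > best[key][1]:
            best[key] = (hit, score)
    return best


def reciprocal_best_hits(scores, species):
    """Return the set of gene pairs that are each other's best hit."""
    best = best_hit_per_species(scores, species)
    pairs = set()
    for (query, _), (hit, _) in best.items():
        back = best.get((hit, species[query]))
        if back is not None and back[0] == query:
            pairs.add(frozenset((query, hit)))
    return pairs


# Hypothetical genes and scores, for illustration only.
species = {"h1": "human", "h2": "human", "m1": "mouse", "m2": "mouse"}
scores = {
    ("h1", "m1"): 1.0, ("h1", "m2"): 0.4,
    ("h2", "m1"): 0.7, ("h2", "m2"): 0.3,
    ("m1", "h1"): 1.0, ("m1", "h2"): 0.7,
    ("m2", "h1"): 0.4, ("m2", "h2"): 0.3,
}
print(reciprocal_best_hits(scores, species))  # only h1 and m1 are reciprocal
```

In this toy example h2 and m2 have hits but no reciprocal partner, which is exactly the kind of gene that a pure reciprocal-best-hit approach leaves unplaced.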

In brief, for each gene that had successfully identified one or more RBNHs, the scores for these RBNHs were used to delimit an inclusion threshold (see methods). As all scores are normalised for gene length and phylogenetic distance, hits to other genes (in any species) that had scores above this inclusion threshold were included as putative cognate gene-pairs and added to the orthogroup graph that was subjected to MCL clustering (for further details see methods). This novel data selection criterion resulted in a dramatic improvement in overall clustering accuracy while maintaining gene length independence (Fig. 3c). The overall results for OrthoFinder were 0.85 precision, 0.81 recall and 0.83 F-score.
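The selection of edges for the orthogroup graph can be sketched as below. The precise definition of the inclusion threshold is given in the methods and is not reproduced in this section, so the sketch makes an explicit assumption: a gene's threshold is taken to be the lowest score among its RBNHs, and any hit scoring at or above that value is kept. The function name, data layout and example genes are likewise hypothetical.

```python
def orthogroup_edges(scores, rbnh_pairs):
    """Select the hits that enter the orthogroup graph.

    scores:     {(query, hit): normalised score}
    rbnh_pairs: iterable of (gene_a, gene_b) reciprocal best normalised hits
    Assumption: a gene's inclusion threshold is its lowest RBNH score.
    """
    threshold = {}
    for gene_a, gene_b in rbnh_pairs:
        for query, hit in ((gene_a, gene_b), (gene_b, gene_a)):
            score = scores[(query, hit)]
            threshold[query] = min(score, threshold.get(query, score))
    # Keep every hit of a gene that scores at least as well as its threshold.
    return {
        (query, hit): score
        for (query, hit), score in scores.items()
        if query in threshold and score >= threshold[query]
    }


# Hypothetical scores: a1 has RBNHs b1 and c1; b2 is a likely in-paralogue of b1.
scores = {
    ("a1", "b1"): 0.9, ("b1", "a1"): 0.9,
    ("a1", "c1"): 0.6, ("c1", "a1"): 0.6,
    ("a1", "b2"): 0.7,
    ("a1", "c2"): 0.2,
}
edges = orthogroup_edges(scores, [("a1", "b1"), ("a1", "c1")])
print(sorted(edges))
```

With these values the hit from a1 to b2 (0.7) is admitted because it scores better than a1's weakest RBNH (0.6), whereas the hit to c2 (0.2) is excluded. This is how hits beyond the reciprocal best pairs can raise recall without admitting every weak match.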

OrthoFinder outperforms all other methods from the OrthoBench analysis

Given that OrthoFinder exhibited high accuracy on the benchmark dataset we sought to determine the relative performance to other commonly used methods for orthogroup inference. OrthoFinder outperformed all other methods that have been applied to OrthoBench [5] as measured by F-score (Fig. 3d), performing 8 % better than TreeFam (the next best method), 25 % better than OrthoMCL (the most widely used method), and 33 % better than OMA (the lowest scoring method in this test). Importantly, the precision and recall of OrthoFinder were balanced, demonstrating that the method is not biased towards over- or under-clustering of sequences. It should be noted that OMA exhibits a low recall in this test as its goal is to identify orthologues instead of complete orthogroups and thus paralogues will be absent from the orthologue groups identified by this method. OMA is included here for completeness as it was included in the original OrthoBench analysis [5].

In addition to accuracy, a number of other criteria were used to compare the performance of the different inference methods in the OrthoBench paper. These criteria included the percentage of orthogroups predicted without any errors, the number of erroneously assigned genes (that is, false positives, and thus also captured by the precision) and missing genes (that is, false negatives, and thus also captured by recall) in the assignment of genes to orthogroups and the proportion of orthogroups affected by these false positive and false negatives. The results for OrthoFinder according to these criteria are reported in Additional file 3: Figure S2 and are consistent with the increased accuracy of OrthoFinder compared to other methods. Additionally, the 70 orthogroups that make up the OrthoBench dataset comprise 40 that represent particular biological or technical challenges and 30 randomly chosen orthogroups. Additional file 4: Figure S3 shows the F-scores for these two categories separately to illustrate the difference in performance of the method for ‘randomly selected’ and ‘difficult’ orthogroups. OrthoFinder outperformed all other methods in both categories and achieved an F-score of 81 % and 90 % on the difficult and randomly selected orthogroups, respectively.
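For reference, precision, recall and F-score relate to counts of true positives, false positives and false negatives as sketched below. The sketch assumes the standard F-score (the harmonic mean of precision and recall), which is consistent with the overall values reported for OrthoFinder (0.85 precision and 0.81 recall giving an F-score of 0.83). The counts in the example call are invented.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 score)."""
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(true_pos, false_pos, false_neg):
    """Compute precision, recall and F-score from raw counts."""
    precision = true_pos / (true_pos + false_pos)  # share of assignments that are correct
    recall = true_pos / (true_pos + false_neg)     # share of true members that are recovered
    return precision, recall, f_score(precision, recall)


# Consistency check against the overall values reported in the text.
print(round(f_score(0.85, 0.81), 2))  # 0.83

# Invented counts: 8 correct assignments, 2 erroneous genes, 2 missing genes.
print(precision_recall_f(8, 2, 2))
```

Because the harmonic mean is dominated by the smaller of its two inputs, a method cannot achieve a high F-score by maximising precision at the expense of recall, or vice versa.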

OrthoFinder is suitable for the analysis of incomplete datasets

As many research groups are producing partial genome assemblies and transcriptome resources it is to be expected that sequence datasets will be missing genes due to incomplete assembly, low expression or errors in gene prediction. To demonstrate the suitability of OrthoFinder for analysing these incomplete datasets we assessed the performance of OrthoFinder with between 5 % and 60 % of genes deleted at random from the OrthoBench input sequences. This revealed that the accuracy of OrthoFinder is robust to missing data and that it achieved an F-score of over 0.75 even when 60 % of the genes were missing from the input dataset (Additional file 5: Figure S4). Thus OrthoFinder is suitable for orthogroup inference from partial and incomplete datasets.
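A random-deletion experiment of this kind can be set up with a few lines of Python. The helper below is a hypothetical sketch: the function name, the fixed seed, the placeholder gene identifiers and the intermediate deletion fractions are assumptions (the text states only the 5 % and 60 % end points).

```python
import random


def delete_random_genes(genes, fraction, seed=0):
    """Return a copy of the gene list with the given fraction removed at random.

    A seeded generator is used so that each subsample is reproducible.
    """
    rng = random.Random(seed)
    n_remove = round(len(genes) * fraction)
    removed = set(rng.sample(range(len(genes)), n_remove))
    return [gene for index, gene in enumerate(genes) if index not in removed]


# Placeholder identifiers standing in for the input protein sequences.
genes = [f"gene_{i:03d}" for i in range(100)]
for fraction in (0.05, 0.2, 0.4, 0.6):
    subsample = delete_random_genes(genes, fraction)
    print(f"{fraction:.0%} deleted -> {len(subsample)} genes remain")
```

Each subsample would then be clustered and scored against the benchmark in the same way as the full dataset.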

OrthoFinder is fast and scalable

The number of species for which genome or transcriptome sequence resources are available is increasing rapidly and there is a corresponding need to be able to infer orthogroups using these datasets as they emerge. To keep pace with these increasing demands the algorithm utilises sparse matrices as the central data structure and performs many steps using matrix operations. For example, starting from pre-computed raw BLAST scores the identification of orthogroups for the OrthoBench dataset (12 species, 235,033 sequences) takes 14 min 20 s using OrthoFinder on a single core of an Intel Core i7-4770 3.4 GHz CPU. For comparison, OrthoMCL takes 20 h 12 min to perform the same operation using the same CPU and MySQL for its relational database management system. As the number of genomes that must be analysed increases, the scalability of the methods used becomes increasingly important. To demonstrate the scalability of OrthoFinder, the full set of sequenced plant genomes from Phytozome version 9.0 (n = 41 [28]) were clustered and the results are shown in Fig. 4. Plant genomes were selected for this test as they are large, with an average of 30,731 protein coding genes per species in Phytozome version 9.0, and thus they represent a stringent assessment of the scalability of OrthoFinder. The memory (RAM) requirements increase linearly with the number of species clustered (Fig. 4a). This is despite the fact that the number of BLAST hits increases quadratically with the number of species (Fig. 4c). This linear scaling is achieved by processing the BLAST hits for each species sequentially and independently within OrthoFinder. Though the memory requirements increase linearly, the time requirements starting from pre-computed raw BLAST scores increase quadratically with the number of species (Fig. 4b). This is to be expected as the number of BLAST hits that must be processed also increases quadratically.

For example, identifying the orthogroups for all 41 plant species from Phytozome requires approximately 4 GB of RAM and took approximately 3 h on a single CPU core. Fitting the data to a line and extrapolating, we estimate that approximately 450 plant-sized genomes can be clustered on a Linux computer with 64 GB of RAM (Fig. 4a). Thus OrthoFinder is capable of large analyses on conventional computing resources. It should be noted here that the BLAST searches incur the largest computational cost in any orthogroup inference analysis and that this cost is the same for all inference methods that use BLAST. In summary, OrthoFinder is fast and scalable to hundreds of species on conventional computing resources.
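The quadratic growth in work, and the motivation for sparse storage, can be made concrete with a short calculation. The sketch assumes that every ordered pair of species, including each species against itself, contributes one set of BLAST hits; the helper names are hypothetical.

```python
def species_pair_searches(n_species):
    """Number of ordered species pairs, counting each species against itself
    (assumed to be how an all-vs-all comparison is organised)."""
    return n_species * n_species


def dense_cells(genes_per_species):
    """Cells in a dense gene-by-gene score matrix for a single species pair."""
    return genes_per_species * genes_per_species


for n in (2, 12, 41):
    print(n, "species ->", species_pair_searches(n), "species-pair comparisons")

# With the Phytozome average of 30,731 genes per species, a dense matrix for
# one species pair would need this many cells, almost all of them empty:
print(dense_cells(30_731))
```

Only a small fraction of gene pairs produce a BLAST hit, so storing just the observed hits (a sparse matrix) and handling one species at a time keeps peak memory far below what dense storage of all pairs would require.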

Fig. 4 Memory and time requirements of OrthoFinder. Sub-samples of between two and 41 plant genomes from Phytozome version 9.0 given pre-calculated BLAST results. The average number of genes per species was 30,731. a Maximum RAM requirements. b Time requirements. c The number of BLAST hits that must be processed for a given number of species (provided to place the time and RAM requirements into context)

Inference of high accuracy plant transcription factor orthogroups

Given that OrthoFinder has increased accuracy over other methods and that gene length bias has been eliminated from orthogroup inference, we sought to provide an additional demonstration of the utility of OrthoFinder for the inference of orthogroups. To do this we selected plant transcription factors as they are short genes and will thus suffer from low rates of recall in assignment to orthogroups in the absence of gene length bias correction. Moreover transcription factor genes are preferentially retained following whole genome duplication events [29, 30] and thus transcription factor orthogroups are larger than average and contain multiple independent duplication events in multiple independent lineages that can cause some inference methods to fail. Finally, previous efforts to define transcription factor orthogroups have utilised OrthoMCL [31]. Thus current transcription factor orthogroups will have low recall resulting in fragmented orthogroups spanning few species.

Using established rules for the identification and classification of transcription factors [31] we identified and typed all of the transcription factors present in the 41 genomes in Phytozome v9. The complete predicted proteomes from these 41 genomes were then subjected to clustering using OrthoFinder and OrthoMCL and the distribution of transcription factors in the resultant orthogroups was analysed. OrthoMCL was used here as it is the method by which all transcription factor families are currently classified [31]. Consistent with the increased recall rate for OrthoFinder, analysis of the resulting orthogroups revealed that 8.5 percentage points more transcription factors were placed in orthogroups using OrthoFinder than OrthoMCL (Fig. 5a, 97.6 % and 89.1 %, respectively, n = 52,744). Also consistent with the increased recall rate is that these orthogroups were less fragmented than those that were produced by OrthoMCL (Fig. 5b, 897 and 3,024 orthogroups, respectively). Importantly, the orthogroups inferred using OrthoFinder were missing fewer RBHs (Fig. 5c, 2.15 % and 5.77 %, respectively) and clustered more of the same type of transcription factor together (Fig. 5d and e). A major difference between those orthogroups inferred using OrthoFinder and OrthoMCL is that those produced by OrthoFinder encompass a larger number of species than those recovered by OrthoMCL (Fig. 5f); thus orthogroups produced by OrthoFinder encompass greater phylogenetic distances.

Fig. 5 Inference of orthogroups of plant transcription factors. In all cases dark grey bars indicate the results for OrthoFinder and light grey bars indicate the results for OrthoMCL. a Comparison of the fraction of transcription factors that are assigned to orthogroups by OrthoFinder and by OrthoMCL. b Comparison of the number of transcription factor orthogroups identified using each method. c The percentage of RBNH/RBH (for OrthoFinder/OrthoMCL) hits that are not contained in orthogroups identified using each method. d The number of transcription factors of the same type that each transcription factor is connected to in the orthogroups produced by OrthoFinder. e As in (d) but for OrthoMCL. f Comparison of species coverage for transcription factor orthogroups identified by each method. g The number of orthogroups for each transcription factor type identified by OrthoFinder

As OrthoFinder clustered the transcription factors together into far fewer orthogroups than OrthoMCL (897 versus 3,024) we sought to demonstrate that it was correct in doing so. To do this we used gene-tree/species-tree reconciliation to determine if the orthogroups were true orthogroups or if they incorrectly clustered sequences that are separated by a gene duplication event that occurred before the last common ancestor of the species in the analysis. Overall, 858 of the 897 OrthoFinder orthogroups (96 %) consisted entirely of genes that were correctly clustered together and only 39 contained some genes that were separated by a duplication prior to the last common ancestor (Additional file 6: Table S2 and Additional file 7: Table S3). Of the 897 OrthoFinder orthogroups, 210 were identical to ones from OrthoMCL and 471 OrthoFinder orthogroups were strict supersets of 2,271 OrthoMCL orthogroups (Additional file 6: Table S2 and Additional file 7: Table S3). Of these, 90 % (425) were true orthogroups that each encompassed on average four OrthoMCL orthogroups (1,709 in total).
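The proportions quoted in this paragraph follow directly from the reported counts, as the short check below shows. Only numbers stated in the text are used; the variable names are chosen for illustration.

```python
# Counts reported in the text.
orthofinder_orthogroups = 897
true_orthogroups = 858
strict_supersets = 471
true_supersets = 425
orthomcl_groups_in_true_supersets = 1709

# Share of OrthoFinder orthogroups that consist only of correctly clustered genes.
print(round(100 * true_orthogroups / orthofinder_orthogroups))   # 96

# Orthogroups containing genes separated by an earlier duplication.
print(orthofinder_orthogroups - true_orthogroups)                # 39

# Share of strict-superset orthogroups that are true orthogroups.
print(round(100 * true_supersets / strict_supersets))            # 90

# Average number of OrthoMCL orthogroups united by each true superset.
print(round(orthomcl_groups_in_true_supersets / true_supersets)) # 4
```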

An illustrated example showing an OrthoFinder orthogroup and its constituent OrthoMCL orthogroups is provided in Fig. 6. Here the OrthoFinder orthogroup (labelled bHLH 8 in Additional file 6: Table S2) contains all known type IVc bHLH transcription factors [32]. Type IVc bHLH transcription factors have previously been shown to be conserved from green algae to land plants and thus span the complete taxonomic range contained in this analysis [32]. The OrthoFinder orthogroup correctly united eight paraphyletic OrthoMCL orthogroups and included 36 transcription factors (highlighted in grey) that were not clustered into any orthogroups by OrthoMCL (Fig. 6). The phylogenetic tree shows that there are no genes present in this OrthoFinder orthogroup that were the product of a gene duplication event prior to the divergence of the last common ancestor of all species in the analysis. This is only one example and the complete set of phylogenetic trees for each OrthoFinder transcription factor orthogroup are provided in Additional file 6: Table S2 along with the OrthoMCL subsets that comprise these groups where appropriate. Also contained in this table are the results of the gene-tree/species-tree reconciliation for each tree inferred from an OrthoFinder orthogroup.

Fig. 6 A bootstrapped maximum likelihood phylogenetic tree of the OrthoFinder orthogroup containing the type IVc bHLH transcription factors (bHLH 8). The OrthoMCL orthogroups that are subsets of the OrthoFinder orthogroup are indicated by different coloured fonts. Thirty-six of the OrthoFinder clustered genes (coloured grey) failed to be clustered in any OrthoMCL orthogroup. The tree was inferred with RAxML using the PROTGAMMAAUTO model (the JTT model was selected as having the highest likelihood) with 100 bootstrap replicates. Scale bar indicates the number of substitutions per site. Percentage bootstrap support values are indicated by coloured circles shown at internal nodes.

Taken together, using OrthoFinder to cluster transcription factor genes resulted in the identification of 687 (897 less the 210 that were the same) novel orthogroups of transcription factors across 41 different species, comprising 7.7 million pairwise relationships (of which 6.9 million are not detected by OrthoMCL). Thus using OrthoFinder to cluster transcription factors has provided significant new insight into the relationship of transcription factor genes across plants. The number of orthogroups for each transcription factor type is provided in Fig. 5g and the full classification including all constituent accession numbers is provided in Additional file 6: Table S2.

Algorithm implementation and evaluation criteria

OrthoFinder is an algorithm that infers orthogroups across multiple species. The method does not classify the pairwise relationships that exist between genes within these orthogroups. The method does not require synteny information and is thus equally suitable for clustering protein sequences predicted from genome or transcriptome datasets. OrthoFinder is run with a single command and requires as input a directory containing one protein sequence FASTA file per species to be clustered. OrthoFinder does not require preprocessing of FASTA files (such as filtering of sequences) and does not require knowledge or use of any relational database management system such as MySQL. It outputs orthogroups in two file formats: the Quest for Orthologs community standard OrthoXML [33] and in plain text format with one orthogroup per line.

There are two common problem definitions used by the majority of homology inference algorithms. One is to predict pairs of orthologues (pairs of genes from two different species descended from a single gene in the last common ancestor of the two species) and pairs of recent, within-species paralogues (gene-pairs arising from a duplication event since the last speciation event for that species). The other approach, and the one used here for OrthoFinder, is to predict orthogroups. An orthogroup is the set of genes derived from a single gene in the last common ancestor of all the species under consideration. This is the approach used by OrthoMCL [13] and eggNOG [34]. OrthoFinder follows this second approach to produce orthogroups of protein coding genes because it is a logical extension of orthology to multiple species: it groups all genes descended from a single gene in the last common ancestor of all species being considered.

Methodological overview of the OrthoFinder algorithm

An overview of the algorithm is shown in Fig. 7. It proceeds in five phases corresponding to sections b-f in the figure:

Fig. 7 Overview of the steps in the OrthoFinder algorithm for two example orthogroups of genes from three species. (a) The unknown orthogroups that the algorithm must recover, shown as a gene tree. (b) BLAST search of all genes against all genes. (c) Gene length and phylogenetic distance normalisation of BLAST bit scores to give the scores to be used for orthogroup inference. (d) Selection of putative cognate gene-pairs from normalised BLAST scores. (e) Construction of the orthogroup graph: genes are nodes in the graph and pairs of genes are connected by an edge, with edge weights given by the normalised bit score. (f) Clustering of genes into discrete orthogroups using MCL.

1. BLAST all-versus-all search (Fig. 7b). Protein BLAST (blastp) with an e-value threshold of $10^{-3}$ is used so as to avoid discarding putative good hits for very short sequences. A relaxed threshold is used at this stage of the method as subsequent steps filter out false positive hits using stringent, orthogroup-specific criteria for inclusion (described below).

2. Gene length and phylogenetic distance normalisation of the BLAST bit scores (Fig. 7c). This step models the all-vs-all BLAST hits for each pairwise comparison between species to identify and remove the dependency of gene similarity on gene length and phylogenetic distance. This is done so that the best hits between all species achieve the same scores regardless of sequence length or phylogenetic distance.

3. Delimitation of orthogroup sequence similarity thresholds using RBNHs (Fig. 7d). This step uses information from reciprocal best length-normalised hits (RBNHs) to define the lower limit of sequence similarity for putative cognate genes of each query gene. To be included in the orthogroup graph a gene-pair must be an RBNH or produce a hit that is better scoring than the lowest scoring RBNH (irrespective of species) for either gene.

4. Constructing an orthogroup graph for input into MCL (Fig. 7e). Putative cognate gene-pairs are identified as above and are connected in the orthogroup graph with weights given by the normalised BLAST bit scores.

5. Clustering of genes into orthogroups using MCL (Fig. 7f).

Steps 2 to 4 (normalisation, threshold delimitation and graph construction) are the novel parts of our algorithm and are described in detail below.

Gene length and phylogenetic distance normalisation

The aim of this normalisation procedure is to remove gene length bias from BLAST bit scores and to normalise for phylogenetic distance between species. MCL converts sets of similarity scores into clusters by breaking apart clusters of genes that have low similarity scores (and therefore are unlikely to be orthogroups) and preserving clusters of sequences that have high similarity scores. If the similarity scores between long sequences are inherently larger than the similarity scores between short sequences then the clustering will preferentially break apart clusters of short sequences while preserving clusters of long sequences. This effect can be clearly seen in the results of a typical OrthoMCL cluster. Here, long sequences are placed in overly large clusters leading to low precision, and short sequences remain un-clustered leading to low recall (Fig. 3a). The species-wise normalisation implemented by OrthoFinder similarly ensures that orthologues from more distant species (that have inherently lower similarity scores due to phylogenetic distance) are not preferentially discarded and is similar to a step that is performed in OrthoMCL wherein all scores are divided by the average score between a given species pair [13].

Previous methods have exploited BLAST e-values (rather than bit scores) as a measure of similarity between sequences. However, as can be seen in Fig. 1b the use of e-values for assessment of similarity between sequences is flawed. Here, the minimum e-value that can be obtained for a given query sequence decreases with increasing sequence length until, at a certain length, the lower bound for e-values is reached and BLAST returns an e-value of 0. This creates two problems: (1) long sequences will frequently have low quality hits with better e-values than the best possible hits of short sequences and (2) the floor value for the e-value calculation means that length bias is non-uniform and thus irreversible. Specifically, beyond the floor-value e-values cannot be used to distinguish between the qualities of hits as they are all assigned the same e-value. As can be seen in the heat map shown in Fig. 1b, many hits obtain this floor-value for a given pairwise species comparison and thus their similarities are indistinguishable. This length-bias must be removed to prevent biasing downstream clustering applications.

In this method we construct a similarity measure between sequences based on the bit score, normalised to take into account the lengths of the query and hit sequences and the phylogenetic distance between species. Unlike e-values, bit scores do not suffer from the presence of a threshold limit, and thus different amounts of sequence similarity can be distinguished regardless of the lengths of the sequences involved. Let $L_q$ be the length of the query sequence and $L_h$ be the length of the hit sequence. In an analogous manner to the e-value calculation made by BLAST and other sequence comparison methods, we use the variable $L_{qh} = L_q L_h$ to quantify the lengths of a pair of sequences that are being compared.

The length normalisation procedure is shown in Fig. 2. For each species pair, we:

1. Sort all BLAST hits according to $L_{qh} = L_q L_h$.

2. Put the hits into equal sized bins of 1,000 hits (put the ‘shortest’ 1,000 hits according to $L_{qh}$ into one bin, the next 1,000 hits into the next bin and so on for all the hits). If there are fewer than 5,000 hits then we put the hits into bins of 200. Using fixed sized bins means that it is not necessary for the algorithm to specify the location of the bins in advance.

3. Sort the hits in each bin according to BLAST bit score and select the top 5 % of hits from each bin. Find the parameters $a$ and $b$ that best describe the fit between sequence similarity scores and gene length for the selected hits using the equation $\log_{10} B_{qh} = a \log_{10} L_{qh} + b$, where $B_{qh}$ is the BLAST bit score between sequences $q$ and $h$.

4. Normalise all obtained BLAST bit scores (not just the top 5 %) between the given species pair according to $B'_{qh} = B_{qh} / (10^{b} L_{qh}^{a})$, so that $B'_{qh}$ (the normalised score) is the BLAST bit score for a hit divided by the BLAST bit score that would be expected for the best hits between sequences of that length for the species pair under consideration.

The top 5 % of hits are used rather than RBHs as selection of RBHs will be affected by the gene length-bias that we wish to correct. Moreover, gene duplication events can frequently cause RBHs to fail (Additional file 8: Figure S5) and thus reduce the number of data points that are available for fitting. The normalisation procedure ensures that the best hits between a given species pair achieve (on average) the same scores irrespective of their gene length.
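The binning, fitting and normalisation steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not OrthoFinder's own implementation: the function names, the `(L_q, L_h, bit_score)` tuple layout and the use of ordinary least squares for the fit are all choices made here for clarity.

```python
import math


def fit_length_normalisation(hits, bin_size=1000, small_bin_size=200,
                             small_threshold=5000, top_fraction=0.05):
    """Fit log10(B_qh) = a * log10(L_qh) + b for one species pair.

    `hits` is a list of (L_q, L_h, bit_score) tuples (hypothetical layout).
    Returns the fitted parameters (a, b).
    """
    # Fewer than 5,000 hits: use bins of 200 instead of 1,000.
    if len(hits) < small_threshold:
        bin_size = small_bin_size

    # Step 1: sort by the length product L_qh = L_q * L_h.
    ordered = sorted(hits, key=lambda hit: hit[0] * hit[1])

    # Steps 2-3: fixed-size bins, keep the top 5 % of each bin by bit score.
    xs, ys = [], []
    for start in range(0, len(ordered), bin_size):
        bin_hits = sorted(ordered[start:start + bin_size],
                          key=lambda hit: hit[2], reverse=True)
        n_top = max(1, int(len(bin_hits) * top_fraction))
        for l_q, l_h, score in bin_hits[:n_top]:
            xs.append(math.log10(l_q * l_h))
            ys.append(math.log10(score))

    # Ordinary least squares on the selected points (assumes the selected
    # hits do not all share the same length product).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def normalise_score(score, l_q, l_h, a, b):
    """Step 4: B'_qh = B_qh / (10**b * L_qh**a)."""
    return score / (10 ** b * (l_q * l_h) ** a)


# Synthetic hits lying exactly on B = 2 * L_qh**0.5, so the fit is known.
example_hits = [(l_q, 100, 2.0 * (l_q * 100) ** 0.5) for l_q in range(50, 650)]
a, b = fit_length_normalisation(example_hits)
print(a, b, normalise_score(2.0 * (300 * 100) ** 0.5, 300, 100, a, b))
```

With this normalisation, a hit that scores as well as the best hits for its length receives a value close to 1 whether the sequences are short or long.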

OrthoFinder also normalises for phylogenetic distance. This is done so that the similarity scores between orthologues will be independent of phylogenetic distance (that is, the true orthologues in distantly related species will obtain similar scores to the true orthologues in closely related species). If this step is not done then true orthologues in distantly related species will always obtain lower scores than true orthologues in closely related species. Thus during graph clustering (which is unaware of the phylogenetic relationships between species) distantly related true orthologues (and cognates) will become disconnected from each other more easily than closely related true orthologues (and cognates) in the orthogroup graph. Previous efforts to prevent this phylogenetic bias include dividing the observed similarity score for any given gene-pair by the mean similarity score observed for all reciprocal best hits between genes in that species pair [13]. However, in the absence of gene length information this means that short genes will always be penalised more than long genes.

Though there is precedent for the use of $L_{qh} = L_q L_h$ to quantify the lengths of a pair of sequences that are being compared [14], different functions for gene length normalisation were also assessed. All other functions, including for example the use of the variable $\tilde{L}_{qh} = L_q + L_h$ in place of $L_{qh}$, gave a lower overall clustering accuracy.

Identification of putative cognate gene-pairs for inclusion in the orthogroup graph

Once scores are normalised, OrthoFinder exploits RBNHs to identify putative cognate gene-pairs. Reciprocal best hits (RBHs) are a high precision method to identify putative orthologues [25–27], and OrthoFinder applies the same reciprocal requirement to its length- and phylogenetic-distance-normalised BLAST scores. For each gene the scores for its RBNHs are used to delimit the extent of sequence similarity of that gene’s orthogroup. Specifically, for each query sequence $q$, any hit $h$ with a normalised score $B'_{qh}$ greater than or equal to the score for the lowest scoring RBNH of $q$ is selected as a putative cognate gene-pair of $q$ and is therefore connected to $q$ in the orthogroup graph that is subsequently subjected to MCL clustering.

The rationale for this approach is that the level of normalised similarity of a query gene and its RBNHs can be used to estimate the extent of similarity of other genes within the same orthogroup. All genes more similar to a query gene than any of the query gene’s RBNHs (irrespective of species) are likely members of the same orthogroup. Therefore, the normalised similarity score for the most dissimilar RBNH of a gene is used as a cutoff for inclusion of additional cognate gene-pairs from all species. That is, $q$ is connected to $h$ in the orthogroup graph if $B'_{qh} \geq B'_{qR}$, where $R$ is the lowest scoring RBNH of $q$. This provides a simple and robust method for recovering cognate gene-pairs that may otherwise be difficult to detect due to duplication events that can cause the RBNH method to fail. Further details, explanation and worked examples are provided in Additional file 8: Figure S5.
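The cutoff rule described above can be illustrated with a small sketch. It assumes a dictionary of already normalised scores keyed by `(query, hit)` and a gene-to-species lookup; the function name, the data layout and the choice to keep the larger of the two directional scores as the edge weight are assumptions made for this example, not details taken from OrthoFinder.

```python
def putative_cognate_edges(scores, species):
    """Return {frozenset({q, h}): weight} for the orthogroup graph.

    scores:  {(q, h): normalised score B'_qh} (hypothetical layout)
    species: {gene: species name}
    """
    # Best-scoring hit of each gene in each other species.
    best = {}
    for (q, h), s in scores.items():
        if species[q] == species[h]:
            continue
        key = (q, species[h])
        if key not in best or s > best[key][0]:
            best[key] = (s, h)

    # A pair is an RBNH when each gene is the other's best hit.
    # The cutoff for q is the score of its lowest scoring RBNH.
    cutoff = {}
    for (q, _), (s, h) in best.items():
        back = best.get((h, species[q]))
        if back is not None and back[1] == q:
            cutoff[q] = min(s, cutoff.get(q, s))

    # Connect q to every hit scoring at least as well as that cutoff,
    # irrespective of species (this recovers within-species duplicates).
    edges = {}
    for (q, h), s in scores.items():
        if q != h and q in cutoff and s >= cutoff[q]:
            pair = frozenset((q, h))
            edges[pair] = max(s, edges.get(pair, s))
    return edges


# Toy data: a1 and a2 are duplicates in species A; x1 is an unrelated gene.
species = {"a1": "A", "a2": "A", "b1": "B", "c1": "C", "x1": "C"}
pairs = {("a1", "b1"): 1.0, ("a1", "c1"): 0.8, ("b1", "c1"): 0.9,
         ("a2", "b1"): 0.85, ("a2", "c1"): 0.7, ("a1", "a2"): 0.95,
         ("a1", "x1"): 0.2, ("b1", "x1"): 0.25}
scores = {}
for (g1, g2), s in pairs.items():  # make the toy scores symmetric
    scores[(g1, g2)] = s
    scores[(g2, g1)] = s

edges = putative_cognate_edges(scores, species)
print(sorted(tuple(sorted(pair)) for pair in edges))
```

In this toy case a2 has no RBNH of its own, because b1 and c1 both prefer a1, yet it is still connected to a1 since their score exceeds a1's cutoff. The low-scoring x1 stays disconnected.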

In summary, the novel method presented here generates, for each query gene, an independent prediction of all the genes in its orthogroup. The resulting orthogroup graph is then clustered using MCL with its default inflation parameter of 1.5. The effect of varying the MCL inflation parameter on the OrthoFinder result is shown in Additional file 9: Figure S6. The F-score of OrthoFinder is relatively stable to variation in the MCL inflation parameter; however, it is possible to trade precision against recall by varying this parameter (Additional file 9: Figure S6). For comparison the analogous analysis is also presented for OrthoMCL (Additional file 10: Figure S7).


OrthoFinder is written in Python. It requires Python together with the numpy and scipy libraries [35] to be installed. OrthoFinder requires the standalone BLAST+ and MCL algorithms that are freely available. These standalone applications must be installed separately from OrthoFinder and are not included in the OrthoFinder package. The implementation makes use of sparse matrices to store hits between sequences. This provides a memory efficient method of storing the data and allows key parts of the algorithm to be expressed using scipy’s highly optimised C++ implementations of sparse matrix operations. OrthoFinder can either run the BLAST searches for you or can be run on pre-computed BLAST searches. If you choose to run BLAST searches independently then instructions are provided in the documentation for how to process your sequence names in the pre-computed BLAST output. Similarly, OrthoFinder will also automatically run MCL for you. However, if you wish to run MCL separately using different parameter settings then the MCL input files are stored for this purpose in a working directory.


OrthoBench [5] is the only manually curated dataset of orthogroups for the assessment of orthogroup prediction algorithms. It was used in this work for assessing OrthoFinder as it has been independently evaluated, it underpins the testing of multiple different methods and it is a well-defined and stringent test of the problem that OrthoFinder was designed to solve. Criteria such as functional similarity within orthogroups, expressed for example using enzyme classification numbers [36], were not used in this work since not all proteins with the same function are members of the same orthogroup and members of the same orthogroup do not necessarily all have the same function. As we are using real benchmark datasets for which only a subset of sequences have been assigned to ‘true’ gene families the extent of true negative orthologue assignments is unknown (as is the case for all methods tested on this dataset). Thus we cannot use the Matthews correlation coefficient to assess the performance of the orthogroup inference methods. In the absence of this information the simplest and most transparent evaluation of the accuracy of any prediction method is to measure its precision and recall.

Precision is defined as $TP/(TP + FP)$ and recall as $TP/(TP + FN)$, where TP is the number of true positive orthogroup assignments (that is, correct assignments), FP is the number of false positive orthogroup assignments (that is, incorrect assignments) and FN is the number of false negative orthologue assignments (that is, missing assignments). The F-score is the harmonic mean of these two measures, $F = 2 \times \text{precision} \times \text{recall} / (\text{precision} + \text{recall})$, where the harmonic mean weights towards the worst performing measure. We also provide other evaluation measures from the original OrthoBench analysis in Additional file 3: Figure S2.
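As a worked illustration of these measures, the sketch below computes precision, recall and F-score from TP, FP and FN counts using the standard definitions. The function name and the example counts are made up for illustration and are not results from the OrthoBench evaluation.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-score) from assignment counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F-score = harmonic mean of precision and recall.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Illustrative counts only: high precision but low recall.
precision, recall, f_score = precision_recall_f(tp=90, fp=10, fn=90)
arithmetic_mean = (precision + recall) / 2
print(precision, recall, f_score, arithmetic_mean)
```

Here the F-score falls below the arithmetic mean of the two measures, which is the sense in which the harmonic mean weights towards the worse of the two.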

Inference of transcription factor orthogroups

To infer transcription factor orthogroups we first identified the set of transcription factors present in all genomes present in Phytozome V9. This identification was performed using the same rules for the presence and absence of PFAM domains as has been previously described [31]. The full set of protein coding genes from these genomes (including all the transcription factors) was then clustered using OrthoFinder and OrthoMCL and the distribution of the transcription factors within these orthogroups was analysed. OrthoMCL was selected for comparison here as it is the method by which all orthogroups of transcription factors are currently defined [31]. An orthogroup of transcription factors was defined as an orthogroup whose constituent genes comprised ≥50 % transcription factors of the same domain classification.
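The ≥50 % rule for calling an orthogroup a transcription factor orthogroup can be sketched as follows. The function name, the gene identifiers and the `tf_class` lookup are hypothetical; the sketch also assumes each transcription factor carries a single domain classification.

```python
from collections import Counter


def transcription_factor_class(orthogroup, tf_class, threshold=0.5):
    """Return the domain class if at least `threshold` of the genes in
    `orthogroup` are transcription factors of that one class, else None.

    orthogroup: list of gene identifiers
    tf_class:   {gene: domain classification} for transcription factors only
    """
    counts = Counter(tf_class[gene] for gene in orthogroup if gene in tf_class)
    if not counts:
        return None
    # Most frequent class; on an exact tie the first class counted is used.
    domain, n = counts.most_common(1)[0]
    return domain if n / len(orthogroup) >= threshold else None


# Hypothetical gene identifiers and classifications.
tf_class = {"g1": "bHLH", "g2": "bHLH", "g3": "MYB"}
print(transcription_factor_class(["g1", "g2", "g3", "g4"], tf_class))
print(transcription_factor_class(["g1", "g2", "g3", "g4", "g5"], tf_class))
```

The fraction is taken over all genes in the orthogroup, not only the transcription factors, so adding unrelated genes can push an orthogroup below the threshold.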

To determine if OrthoFinder was correct in combining multiple separate OrthoMCL orthogroups, each orthogroup was subjected to gene-tree/species-tree reconciliation. Using gene-tree/species-tree reconciliation it is possible to determine if OrthoFinder had incorrectly placed together genes that had diverged prior to the last common ancestor of the species being analysed. To do this, gene trees were inferred for each orthogroup by aligning the sequences using mafft-linsi [37] and inferring a maximum likelihood tree from this alignment using FastTree [38]. DLCpar [39] was used to reconcile these gene trees with the known species tree [28]. Using this method, each gene tree was assessed to determine if it contained bipartitions that occurred prior to the divergence of the last common ancestor of all the species being analysed. If such a bipartition was identified then the orthogroup was considered not to be a true orthogroup, as it contained one or more genes that evolved by duplication prior to the last common ancestor of all species under consideration.

Biological Astronaut

Imagine you are an exploratory astronaut looking for life throughout the universe. One day you encounter a planet that has no carbon present on its surface. However, your instruments register movement and a variety of other signs that make you think life exists on the surface. Part 1: Before taking a potentially dangerous trip to the surface, you must outline a theoretical framework in which another element can serve as a backbone for macromolecules. (Hint: look for an element on the periodic table that would act similarly to carbon.) Begin by describing this new backbone, including how compounds and macromolecules would form. Detail at least 2 chemical reactions forming macromolecules with this backbone. You may wish to add supporting diagrams (created or obtained). Be sure to include references as appropriate. Part 2: Your theoretical framework is deemed strong enough to justify a trip landside. Once there, you are authorized to collect a simple “organism” for experimental use. Collect your specimen(s) and then design a full experiment that will test at least two characteristics that define biological life on Earth. Be sure to include all the relevant parts of an experiment and describe how you would analyze and present the data, results and conclusions.

Why Is This Important?

Prior to the implementation of these unique suffixes to the proper names of biological products, it was difficult to fully track adverse events for a specific manufacturer’s biologic if that product shared the same proper name as another biologic. [1] The enhanced pharmacovigilance that is expected to result from these unique suffixes will allow for streamlined tracking of adverse events to a specific manufacturer, lot number, and/or manufacturing site. [1] The unique suffix now allows for biological product differentiation.

There are concerns that many healthcare providers and patients assume that naming for biological products follows the same concepts as the naming of small-molecule drugs. If two small-molecule medications share the same generic name, they are more than likely interchangeable. However, this is not the case with biological products. A biological product possessing the same generic name does not imply interchangeability. [2] The Purple Book includes guidance on all licensed biological products, including whether a biological product has been determined by the FDA to be interchangeable or if a biosimilar is interchangeable with the referenced product. [9] The Purple Book was launched in 2014, is managed by the FDA, became searchable in February 2020, and is available at:

Genetically Encoded Fluorescent Proteins Enable High-Throughput Assignment of Cell Cohorts Directly from MALDI-MS Images

Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry imaging (MSI) provides a unique in situ chemical profile that can include drugs, nucleic acids, metabolites, lipids, and proteins. MSI of individual cells (of a known cell type) affords a unique insight into normal and disease-related processes and is a prerequisite for combining the results of MSI and other single-cell modalities (e.g. mass cytometry and next-generation sequencing). Technological barriers have prevented the high-throughput assignment of MSI spectra from solid tissue preparations to their cell type. These barriers include obtaining a suitable cell-identifying image (e.g. immunohistochemistry) and obtaining sufficiently accurate registration of the cell-identifying and MALDI-MS images. This study introduces a technique that overcame these barriers by assigning cell type directly from mass spectra. We hypothesized that, in MSI from mice with a defined fluorescent protein expression pattern, the fluorescent protein's molecular ion could be used to identify cell cohorts. A method was developed for the purification of enhanced yellow fluorescent protein (EYFP) from mice. To determine EYFP's molecular mass for MSI studies, we performed intact mass analysis and characterized the protein's primary structure and post-translational modifications through various techniques. MALDI-MSI methods were developed to enhance the detection of EYFP in situ, and by extraction of EYFP's molecular ion from MALDI-MS images, automated, whole-image assignment of cell cohorts was achieved. This method was validated using a well-characterized mouse line that expresses EYFP in motor and sensory neurons and should be applicable to hundreds of commercially available mice (and other animal) strains comprising a multitude of cell-specific fluorescent labels.


Genetically-encoded fluorescence enables the detection of cell cohorts in situ. Quantitative laser…

Isolation of EYFP proteoforms from YFP-16 mouse brains. A) Fluorimetry at 527 nm…

Characterization of EYFP. Top-down protein characterization, peptide mass fingerprinting, and LC-MS/MS analysis, together…

Optimized in situ detection of EYFP in MALDI-MS using two different automated deposition…

MALDI-MS images in register with the fluorescence image of a YFP-16 mouse brain…


The word "protozoa" (singular protozoon or protozoan) was coined in 1818 by zoologist Georg August Goldfuss, as the Greek equivalent of the German Urthiere, meaning "primitive, or original animals" (ur- ‘proto-’ + Thier ‘animal’). [12] Goldfuss created Protozoa as a class containing what he believed to be the simplest animals. [6] Originally, the group included not only single-celled microorganisms but also some "lower" multicellular animals, such as rotifers, corals, sponges, jellyfish, bryozoa and polychaete worms. [13] The term Protozoa is formed from the Greek words πρῶτος ( prôtos ), meaning "first", and ζῶα ( zôa ), plural of ζῶον ( zôon ), meaning "animal". [14] [15] The use of Protozoa as a formal taxon has been discouraged by some researchers, mainly because the term implies kinship with animals (Metazoa) [16] [17] and promotes an arbitrary separation of "animal-like" from "plant-like" organisms. [18]

In 1848, as a result of advancements in cell theory pioneered by Theodor Schwann and Matthias Schleiden, the anatomist and zoologist C. T. von Siebold proposed that the bodies of protozoans such as ciliates and amoebae consisted of single cells, similar to those from which the multicellular tissues of plants and animals were constructed. Von Siebold redefined Protozoa to include only such unicellular forms, to the exclusion of all metazoa (animals). [19] At the same time, he raised the group to the level of a phylum containing two broad classes of microorganisms: Infusoria (mostly ciliates and flagellated algae) and Rhizopoda (amoeboid organisms). The definition of Protozoa as a phylum or sub-kingdom composed of "unicellular animals" was adopted by the zoologist Otto Bütschli—celebrated at his centenary as the "architect of protozoology" [20] —and the term came into wide use.

As a phylum under Animalia, the Protozoa were firmly rooted in the old "two-kingdom" classification of life, according to which all living beings were classified as either animals or plants. As long as this scheme remained dominant, the protozoa were understood to be animals and studied in departments of Zoology, while photosynthetic microorganisms and microscopic fungi—the so-called Protophyta—were assigned to the Plants, and studied in departments of Botany. [21]

Criticism of this system began in the latter half of the 19th century, with the realization that many organisms met the criteria for inclusion among both plants and animals. For example, the algae Euglena and Dinobryon have chloroplasts for photosynthesis, but can also feed on organic matter and are motile. In 1860, John Hogg argued against the use of "protozoa", on the grounds that "naturalists are divided in opinion—and probably some will ever continue so—whether many of these organisms or living beings, are animals or plants." [16] As an alternative, he proposed a new kingdom called Primigenum, consisting of both the protozoa and unicellular algae (Protophyta), which he combined together under the name "Protoctista". In Hogg's conception, the animal and plant kingdoms were likened to two great "pyramids" blending at their bases in the Kingdom Primigenum.

Six years later, Ernst Haeckel also proposed a third kingdom of life, which he named Protista. At first, Haeckel included a few multicellular organisms in this kingdom, but in later work, he restricted the Protista to single-celled organisms, or simple colonies whose individual cells are not differentiated into different kinds of tissues.

Despite these proposals, Protozoa emerged as the preferred taxonomic placement for heterotrophic microorganisms such as amoebae and ciliates, and remained so for more than a century. In the course of the 20th century, however, the old "two kingdom" system began to weaken, with the growing awareness that fungi did not belong among the plants, and that most of the unicellular protozoa were no more closely related to the animals than they were to the plants. By mid-century, some biologists, such as Herbert Copeland, Robert H. Whittaker and Lynn Margulis, advocated the revival of Haeckel's Protista or Hogg's Protoctista as a kingdom-level eukaryotic group, alongside Plants, Animals and Fungi. [21] A variety of multi-kingdom systems were proposed, and Kingdoms Protista and Protoctista became well established in biology texts and curricula. [22] [23] [24]

While many taxonomists have abandoned Protozoa as a high-level group, Thomas Cavalier-Smith has retained it as a kingdom in the various classifications he has proposed. As of 2015, Cavalier-Smith's Protozoa excludes several major groups of organisms traditionally placed among the protozoa, including the ciliates, dinoflagellates and foraminifera (all members of the SAR supergroup). In its current form, his kingdom Protozoa is a paraphyletic group which includes a common ancestor and most of its descendants, but excludes two important clades that branch within it: the animals and fungi. [8]

Since the protozoa, as traditionally defined, can no longer be regarded as "primitive animals" the terms "protists", "Protista" or "Protoctista" are sometimes preferred. In 2005, members of the Society of Protozoologists voted to change its name to the International Society of Protistologists. [25]

Size

Protozoa, as traditionally defined, range in size from as little as 1 micrometre to several millimetres, or more. [26] Among the largest are the deep-sea–dwelling xenophyophores, single-celled foraminifera whose shells can reach 20 cm in diameter. [27]

Species | Cell type | Size in micrometres
Plasmodium falciparum | malaria parasite, trophozoite phase [28] | 1–2
Massisteria voersi | free-living cercozoan amoeboid [29] | 2.3–3
Bodo saltans | free-living kinetoplastid flagellate [30] | 5–8
Plasmodium falciparum | malaria parasite, gametocyte phase [31] | 7–14
Trypanosoma cruzi | parasitic kinetoplastid, Chagas disease [32] | 14–24
Entamoeba histolytica | parasitic amoebozoan [33] | 15–60
Balantidium coli | parasitic ciliate [34] | 50–100
Paramecium caudatum | free-living ciliate [35] | 120–330
Amoeba proteus | free-living amoebozoan [36] | 220–760
Noctiluca scintillans | free-living dinoflagellate [37] | 700–2000
Syringammina fragilissima | foraminiferan amoeboid [27] | up to 200 000

Habitat

Free-living protozoans are common and often abundant in fresh, brackish and salt water, as well as other moist environments, such as soils and mosses. Some species thrive in extreme environments such as hot springs [38] and hypersaline lakes and lagoons. [39] All protozoa require a moist habitat; however, some can survive for long periods of time in dry environments by forming resting cysts that enable them to remain dormant until conditions improve.

Parasitic and symbiotic protozoa live on or within other organisms, including vertebrates and invertebrates, as well as plants and other single-celled organisms. Some are harmless or beneficial to their host organisms; others may be significant causes of diseases, such as babesiosis, malaria and toxoplasmosis.

Association between protozoan symbionts and their host organisms can be mutually beneficial. Flagellated protozoans such as Trichonympha and Pyrsonympha inhabit the guts of termites, where they enable their insect host to digest wood by helping to break down complex sugars into smaller, more easily digested molecules. [40] A wide range of protozoans live commensally in the rumens of ruminant animals, such as cattle and sheep. These include flagellates, such as Trichomonas, and ciliated protozoa, such as Isotricha and Entodinium. [41] The ciliate subclass Astomatia is composed entirely of mouthless symbionts adapted for life in the guts of annelid worms. [42]

Feeding

All protozoans are heterotrophic, deriving nutrients from other organisms, either by ingesting them whole or consuming their organic remains and waste-products. Some protozoans take in food by phagocytosis, either engulfing organic particles with pseudopodia (as amoebae do) or taking in food through a specialized mouth-like aperture called a cytostome. Others take in food by osmotrophy, absorbing dissolved nutrients through their cell membranes.

Parasitic protozoans use a wide variety of feeding strategies, and some may change methods of feeding in different phases of their life cycle. For instance, the malaria parasite Plasmodium feeds by pinocytosis during its immature trophozoite stage of life (ring phase), but develops a dedicated feeding organelle (cytostome) as it matures within a host's red blood cell. [43]

Protozoa may also live as mixotrophs, supplementing a heterotrophic diet with some form of autotrophy. Some protozoa form close associations with symbiotic photosynthetic algae, which live and grow within the membranes of the larger cell and provide nutrients to the host. Others practice kleptoplasty, stealing chloroplasts from prey organisms and maintaining them within their own cell bodies as they continue to produce nutrients through photosynthesis. The ciliate Mesodinium rubrum retains functioning plastids from the cryptophyte algae on which it feeds, using them to nourish itself by autotrophy. These, in turn, may be passed along to dinoflagellates of the genus Dinophysis, which prey on Mesodinium rubrum but keep the enslaved plastids for themselves. Within Dinophysis, these plastids can continue to function for months. [44]

Motility

Organisms traditionally classified as protozoa are abundant in aqueous environments and soil, occupying a range of trophic levels. The group includes flagellates (which move with the help of whip-like structures called flagella), ciliates (which move by using hair-like structures called cilia) and amoebae (which move by the use of foot-like structures called pseudopodia). Some protozoa are sessile, and do not move at all.

Pellicle

Unlike plants, fungi and most types of algae, protozoans do not typically have a rigid cell wall, but are usually enveloped by elastic structures of membranes that permit movement of the cell. In some protozoans, such as the ciliates and euglenozoans, the cell is supported by a composite membranous envelope called the "pellicle". The pellicle gives some shape to the cell, especially during locomotion. Pellicles of protozoan organisms vary from flexible and elastic to fairly rigid. In ciliates and Apicomplexa, the pellicle is supported by closely packed vesicles called alveoli. In euglenids, it is formed from protein strips arranged spirally along the length of the body. Familiar examples of protists with a pellicle are the euglenoids and the ciliate Paramecium. In some protozoa, the pellicle hosts epibiotic bacteria that adhere to the surface by their fimbriae (attachment pili). [45]

Life cycle

Some protozoa have two-phase life cycles, alternating between proliferative stages (e.g., trophozoites) and dormant cysts. As cysts, protozoa can survive harsh conditions, such as exposure to extreme temperatures or harmful chemicals, or long periods without access to nutrients, water, or oxygen. Being a cyst enables parasitic species to survive outside of a host, and allows their transmission from one host to another. When protozoa are in the form of trophozoites (Greek tropho = to nourish), they actively feed. The conversion of a trophozoite to cyst form is known as encystation, while the process of transforming back into a trophozoite is known as excystation.

Protozoans reproduce asexually by binary fission or multiple fission. Many protozoan species also exchange genetic material by sexual means (typically, through conjugation), but this is generally decoupled from the process of reproduction, and does not immediately result in increased population. [46]

Although meiotic sex is widespread among present day eukaryotes, it has, until recently, been unclear whether or not eukaryotes were sexual early in their evolution. Due to recent advances in gene detection and other techniques, evidence has been found for some form of meiotic sex in an increasing number of protozoans of ancient lineage that diverged early in eukaryotic evolution. [47] (See eukaryote reproduction.) Thus, such findings suggest that meiotic sex arose early in eukaryotic evolution. Examples of protozoan meiotic sexuality are described in the articles Amoebozoa, Giardia lamblia, Leishmania, Plasmodium falciparum biology, Paramecium, Toxoplasma gondii, Trichomonas vaginalis and Trypanosoma brucei.

Historically, the Protozoa were classified as "unicellular animals", as distinct from the Protophyta, single-celled photosynthetic organisms (algae), which were considered primitive plants. Both groups were commonly given the rank of phylum, under the kingdom Protista. [48] In older systems of classification, the phylum Protozoa was commonly divided into several sub-groups, reflecting the means of locomotion. [49] Classification schemes differed, but throughout much of the 20th century the major groups of Protozoa included:

  • Flagellates, or Mastigophora (motile cells equipped with whiplike organelles of locomotion, e.g., Giardia lamblia)
  • Amoebae (cells that move by extending pseudopodia or lamellipodia, e.g., Entamoeba histolytica)
  • Sporozoans, or Sporozoa (parasitic, spore-producing cells, whose adult form lacks organs of motility, e.g., Plasmodium knowlesi), whose former subgroups are now placed in Alveolata, Fungi, Rhizaria and Cnidaria

    With the emergence of molecular phylogenetics and tools enabling researchers to directly compare the DNA of different organisms, it became evident that, of the main sub-groups of Protozoa, only the ciliates (Ciliophora) formed a natural group, or monophyletic clade (that is, a distinct lineage of organisms sharing common ancestry). The other classes or subphyla of Protozoa were all polyphyletic groups composed of organisms that, despite similarities of appearance or way of life, were not necessarily closely related to one another. In the system of eukaryote classification currently endorsed by the International Society of Protistologists, members of the old phylum Protozoa have been distributed among a variety of supergroups. [50]

    As components of the micro- and meiofauna, protozoa are an important food source for microinvertebrates. Thus, the ecological role of protozoa in the transfer of bacterial and algal production to successive trophic levels is important. As predators, they prey upon unicellular or filamentous algae, bacteria, and microfungi. Protozoan species include both herbivores and consumers in the decomposer link of the food chain. They also control bacteria populations and biomass to some extent.

    Disease

    The protozoan Ophryocystis elektroscirrha is a parasite of butterfly larvae, passed from female to caterpillar. Severely infected individuals are weak, unable to expand their wings, or unable to eclose, and have shortened lifespans, but parasite levels vary in populations. Infection creates a culling effect, whereby infected migrating animals are less likely to complete the migration. This results in populations with lower parasite loads at the end of the migration. [51] This is not the case in laboratory or commercial rearing, where after a few generations, all individuals can be infected. [52]


    This work was supported by the Energy Biosciences Institute and the DOE Center for Advanced Bioenergy and Bioproducts Innovation, which is supported by the U.S. Department of Energy, Office of Science, and Office of Biological and Environmental Research under Award Number DE-SC0018420. The work conducted by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. The collection of the M. sinensis and M. sacchariflorus accessions and RAD-seq work was supported by EU FP7 KBBE.2011.3.1-02, Grant Number 289461 (GrassMargins) and the DOE Office of Science, Office of Biological and Environmental Research (BER), Grant Numbers DE-SC0006634 and DE-SC0012379. The generation of the tetraploid M. sacchariflorus whole-genome sequence data was funded by the BBSRC Core Strategic Programme in Resilient Crops: Miscanthus, award number BBS/E/W/0012843A. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the U.S. Department of Energy. DSR is grateful for support from the Chan-Zuckerberg BioHub and the Marthella Foskett Brown family. We thank Alvaro Hernandez and the University of Illinois Keck Center for Illumina RNA sequencing.


    a Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Department of Chemical Biology, College of Chemistry and Molecular Engineering, Synthetic and Functional Biomolecules Center, Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China

    b Roche Innovation Center Shanghai, Roche Pharma Research & Early Development, Shanghai 201203, China


    Pseudopaline is an opine carboxylate metallophore produced by Pseudomonas aeruginosa for harvesting divalent metals. However, the structure of pseudopaline is not fully elucidated. Herein, we report the first de novo total synthesis and isolation of pseudopaline, which allows unambiguous determination and confirmation of both the absolute and the relative configuration of the natural product. The synthesis highlights an efficient and stereocontrolled route using the asymmetric Tsuji–Trost reaction as the key step. The preliminary structure–activity relationship study indicated that one pseudopaline derivative shows comparable activity to pseudopaline. Moreover, a pseudopaline-fluorescein conjugate was prepared and evaluated, which confirmed that pseudopaline could be transported in the bacteria. Since the metal acquisition by P. aeruginosa is crucial for its ability to cause diseases, our extensive structural and functional studies of pseudopaline may pave the way for developing new therapeutic strategies such as the “Trojan horse” antibiotic conjugate against P. aeruginosa.


    Genetics

    A 2008 study compared 112 male-to-female transsexuals (MtFs), both androphilic and gynephilic, and who were mostly already undergoing hormone treatment, with 258 cisgender male controls. Male-to-female transsexuals were more likely than cisgender males to have a longer version of a receptor gene (longer repetitions of the gene) for the sex hormone androgen, which reduced its effectiveness at binding testosterone. [5] The androgen receptor (NR3C4) is activated by the binding of testosterone or dihydrotestosterone, where it plays a critical role in the forming of primary and secondary male sex characteristics. The research suggests reduced androgen and androgen signaling contributes to the female gender identity of male-to-female transsexuals. The authors say that a decrease in testosterone levels in the brain during development might prevent complete masculinization of the brain in male-to-female transsexuals and thereby cause a more feminized brain and a female gender identity. [5] [6]

    A variant genotype for a gene called CYP17, which acts on the sex hormones pregnenolone and progesterone, has been found to be linked to female-to-male (FtM) transsexuality but not MtF transsexuality. Most notably, the FtM subjects not only had the variant genotype more frequently, but had an allele distribution equivalent to male controls, unlike the female controls. The paper concluded that the loss of a female-specific CYP17 T -34C allele distribution pattern is associated with FtM transsexuality. [7]

    Transsexuality among twins

    In 2013, a twin study combined a survey of pairs of twins where one or both had undergone, or had plans and medical approval to undergo, gender transition, with a literature review of published reports of transgender twins. The study found that one third of identical twin pairs in the sample were both transgender: 13 of 39 (33%) monozygotic or identical pairs of assigned males and 8 of 35 (22.8%) pairs of assigned females. Among dizygotic or genetically non-identical twin pairs, there was only 1 of 38 (2.6%) pairs where both twins were trans. [4] The significant percent of identical twin pairs in which both twins are trans and the virtual absence of dizygotic twins (raised in the same family at the same time) in which both were trans would provide evidence that transgender identity is significantly influenced by genetics if both sets were raised in different families. [4]
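    The concordance percentages quoted above follow directly from the pair counts. As a minimal sketch (the dictionary labels and the function name `concordance_percent` are illustrative assumptions, not taken from the study), the arithmetic can be reproduced as follows. Note that rounding to one decimal place may differ in the last digit from a truncated figure reported in the text.

```python
# Pair counts as quoted in the paragraph above: (pairs where both twins are
# transgender, total pairs in that group). Labels are illustrative only.
TWIN_PAIRS = {
    "monozygotic, assigned male": (13, 39),
    "monozygotic, assigned female": (8, 35),
    "dizygotic": (1, 38),
}


def concordance_percent(concordant: int, total: int) -> float:
    """Return the share of pairs in which both twins are transgender, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * concordant / total


if __name__ == "__main__":
    for label, (concordant, total) in TWIN_PAIRS.items():
        rate = concordance_percent(concordant, total)
        print(f"{label}: {concordant}/{total} = {rate:.1f}%")
```

    Keeping the raw counts next to the computed rate makes it easy to check each quoted percentage against its numerator and denominator.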

    Brain structure

    General

    Several studies have found a correlation between gender identity and brain structure. [8] A first-of-its-kind study by Zhou et al. (1995) found that in a region of the brain called the bed nucleus of the stria terminalis (BSTc), a region which is known for sex and anxiety responses (and which is affected by prenatal androgens), [9] cadavers of six persons who were described as having been male-to-female transsexual or transgender persons in life had female-normal BSTc size, similar to the study's cadavers of cisgender women. While those identified as transsexual had taken hormones, this was accounted for by including cadavers of non-transsexual male and female controls who, for a variety of medical reasons, had experienced hormone reversal. The controls still had sizes typical for their gender. No relationship to sexual orientation was found. [10]

    In a follow-up study, Kruijver et al. (2000) looked at the number of neurons in BSTc instead of volumes. They found the same results as Zhou et al. (1995), but with even more dramatic differences. One MtF subject, who had never gone on hormones, was also included and matched up with the female neuron counts nonetheless. [11]

    In 2002, a follow-up study by Chung et al. found that significant sexual dimorphism (variation between sexes) in BSTc did not become established until adulthood. Chung et al. theorized that either changes in fetal hormone levels produce changes in BSTc synaptic density, neuronal activity, or neurochemical content which later lead to size and neuron count changes in BSTc, or that the size of BSTc is affected by the generation of a gender identity inconsistent with one's assigned sex. [12]

    It has been suggested that the BSTc differences may be due to the effects of hormone replacement therapy. It has also been suggested that because pedophilic offenders have also been found to have a reduced BSTc, a feminine BSTc may be a marker for paraphilias rather than transsexuality. [2]

    In a review of the evidence in 2006, Gooren considered the earlier research as supporting the concept of transsexuality as a sexual differentiation disorder of the sex dimorphic brain. [13] Dick Swaab (2004) concurs. [14]

    In 2008, a new region with properties similar to that of BSTc with regard to transsexuality was found by Garcia-Falgueras and Swaab: the interstitial nucleus of the anterior hypothalamus (INAH3), part of the hypothalamic uncinate nucleus. The same method of controlling for hormone usage was used as in Zhou et al. (1995) and Kruijver et al. (2000). The differences were even more pronounced than with BSTc: control males averaged 1.9 times the volume and 2.3 times the neurons of control females, yet regardless of hormone exposure, MtF transsexuals were within the female range and the FtM transsexual within the male range. [15]

    A 2009 MRI study by Luders et al. of 24 MtF transsexuals not yet treated with cross-sex hormones found that regional gray matter concentrations were more similar to those of cisgender men than to those of cisgender women, but there was a significantly larger volume of gray matter in the right putamen compared to cisgender men. Like earlier studies, it concluded that transsexuality was associated with a distinct cerebral pattern. [16] (MRI allows easier study of larger brain structures, but independent nuclei are not visible due to lack of contrast between different neurological tissue types, hence other studies on e.g. BSTc were done by dissecting brains post-mortem.)

    An additional feature was studied comparing 18 female-to-male transsexuals who had not yet received cross-sex hormones with 24 cisgender male and 19 female gynephilic controls, using an MRI technique called diffusion tensor imaging or DTI. [17] DTI is a specialized technique for visualizing white matter of the brain, and white matter structure is one of the differences in neuroanatomy between men and women. The study took into account fractional anisotropy values for white matter in the medial and posterior parts of the right superior longitudinal fasciculus (SLF), the forceps minor, and the corticospinal tract. Rametti et al. (2010) discovered that, "Compared to control females, FtM showed higher FA values in posterior part of the right SLF, the forceps minor and corticospinal tract. Compared to control males, FtM showed only lower FA values in the corticospinal tract." [17] The white matter pattern in female-to-male transsexuals was found to be shifted in the direction of biological males.

    Hulshoff Pol et al. (2006) studied the gross brain volume of eight male-to-female transsexuals and six female-to-male transsexuals undergoing hormone treatment. They found that hormones changed the sizes of the hypothalamus in a gender consistent manner: treatment with male hormones shifted the hypothalamus towards the male direction in the same way as in male controls, and treatment with female hormones shifted the hypothalamus towards the female direction in the same way as female controls. They concluded: "The findings suggest that, throughout life, gonadal hormones remain essential for maintaining aspects of sex-specific differences in the human brain." [18]

    A 2016 review agreed with the other reviews when considering androphilic trans women and gynephilic trans men. It reported that hormone treatment may have large effects on the brain, and that the cortex, which is generally thicker in cisgender women's brains than in cisgender men's brains, may also be thicker in trans women's brains, though in different locations than in cisgender women's brains. [2] It also stated that for both trans women and trans men, "cross-sex hormone treatment affects the gross morphology as well as the white matter microstructure of the brain. Changes are to be expected when hormones reach the brain in pharmacological doses. Consequently, one cannot take hormone-treated transsexual brain patterns as evidence of the transsexual brain phenotype because the treatment alters brain morphology and obscures the pre-treatment brain pattern." [2]

    Androphilic male-to-female transsexuals

    Studies have shown that androphilic male-to-female transsexuals show a shift towards the female direction in brain anatomy. In 2009, a German team of radiologists led by Gizewski compared 12 androphilic transsexuals with 12 cisgender males and 12 cisgender females. Using functional magnetic resonance imaging (fMRI), they found that when shown erotica, the cisgender men responded in several brain regions that the cisgender women did not, and that the sample of androphilic transsexuals was shifted towards the female direction in brain responses. [19]

    In another study, Rametti and colleagues used diffusion tensor imaging (DTI) to compare 18 androphilic male-to-female transsexuals with 19 gynephilic males and 19 androphilic cisgender females. The androphilic transsexuals differed from both control groups in multiple brain areas, including the superior longitudinal fasciculus, the right anterior cingulum, the right forceps minor, and the right corticospinal tract. The study authors concluded that androphilic transsexuals were halfway between the patterns exhibited by male and female controls. [20]

    A 2016 review reported that early-onset androphilic transgender women have a brain structure similar to cisgender women's and unlike cisgender men's, but that they have their own brain phenotype. [2]

    Gynephilic male-to-female transsexuals

    Research on gynephilic trans women is considerably limited. [2] While MRI scans of gynephilic male-to-female transsexuals have likewise shown differences in the brain from non-transsexuals, no feminization of the brain's structure has been identified. [2] Neuroscientists Ivanka Savic and Stefan Arver at the Karolinska Institute used MRI to compare 24 gynephilic male-to-female transsexuals with 24 cisgender male and 24 cisgender female controls. None of the study participants were on hormone treatment. The researchers found sex-typical differentiation between the MtF transsexuals and cisgender males on the one hand and the cisgender females on the other, but the gynephilic transsexuals "displayed also singular features and differed from both control groups by having reduced thalamus and putamen volumes and elevated GM volumes in the right insular and inferior frontal cortex and an area covering the right angular gyrus". [21]

    The researchers concluded that:

    Contrary to the primary hypothesis, no sex-atypical features with signs of 'feminization' were detected in the transsexual group . The present study does not support the dogma that [male-to-female transsexuals] have atypical sex dimorphism in the brain but confirms the previously reported sex differences. The observed differences between MtF-TR and controls raise the question as to whether gender dysphoria may be associated with changes in multiple structures and involve a network (rather than a single nodal area). [21]

    Berglund et al. (2008) tested the response of gynephilic MtF transsexuals to two steroids hypothesized to be sex pheromones: the progestin-like 4,16-androstadien-3-one (AND) and the estrogen-like estra-1,3,5(10),16-tetraen-3-ol (EST). Despite the difference in sexual orientation, the MtFs' hypothalamic networks activated in response to the AND pheromone, like those of the androphilic female control groups. Both groups experienced amygdala activation in response to EST. Gynephilic male control groups experienced hypothalamic activation in response to EST. However, the MtF subjects also experienced limited hypothalamic activation to EST. The researchers concluded that in terms of pheromone activation, MtFs occupy an intermediate position with predominantly female features. [22] The MtF transsexual subjects had not undergone any hormonal treatment at the time of the study, according to their own declaration beforehand, and confirmed by repeated tests of hormonal levels. [22]

    A 2016 review reported that gynephilic trans women differ from both cisgender male and female controls in non-dimorphic brain areas. [2]

    Gynephilic female-to-male transsexuals

    Fewer studies have been performed on the brain structure of transgender men than on transgender women. [2] A team of neuroscientists, led by Nawata in Japan, used a technique called single-photon emission computed tomography (SPECT) to compare the regional cerebral blood flow (rCBF) of 11 gynephilic FtM transsexuals with that of 9 androphilic cis females. Although the study did not include a sample of biological males, which would be needed to draw a conclusion of "male shift", it did reveal that the gynephilic FtM transsexuals showed a significant decrease in blood flow in the left anterior cingulate cortex and a significant increase in the right insula, two brain regions known to respond during sexual arousal. [23]

    A 2016 review reported that the brain structure of early-onset gynephilic trans men generally corresponds to their assigned sex, but that they have their own phenotype with respect to cortical thickness, subcortical structures, and white matter microstructure, especially in the right hemisphere. [2] Morphological increments observed in the brains of trans men might be due to the anabolic effects of testosterone. [2]

    Prenatal androgen exposure

    Prenatal androgen exposure, the lack thereof, or poor sensitivity to prenatal androgens are commonly cited mechanisms to explain the above discoveries. To test this, studies have examined the differences between transsexual and cisgender individuals in digit ratio (a generally accepted marker for prenatal androgen exposure). A meta-analysis concluded that the effect sizes for this association were small or nonexistent. [24]

    Congenital adrenal hyperplasia in persons with XX sex chromosomes results in what is considered to be excess exposure to prenatal androgens, resulting in masculinization of the genitalia and, typically, controversial prenatal hormone treatment [25] and postnatal surgical interventions. [26] Individuals with CAH are usually raised as girls and tend to have similar cognitive abilities to the typical female, including spatial ability, verbal ability, language lateralization, handedness and aggression. Research has shown that people with CAH and XX chromosomes will be more likely to be same sex attracted, [25] and at least 5.2% of these individuals develop serious gender dysphoria. [27]

    In males with 5-alpha-reductase deficiency, conversion of testosterone to dihydrotestosterone is disrupted, decreasing the masculinization of genitalia. Individuals with this condition are typically raised as females due to their feminine appearance at a young age. However, more than half of males with this condition raised as females become males later in their life. Scientists speculate that the definition of masculine characteristics during puberty and the increased social status afforded to men are two possible motivations for a female-to-male transition. [27]

    Psychiatrist and sexologist David Oliver Cauldwell [28] argued in 1947 that transsexuality was caused by multiple factors. He believed that small boys tend to admire their mothers to such a degree that they end up wanting to be like them. However, he believed that a boy would lose this desire as long as his parents set limits when raising him, or he had the right genetic predispositions or a normal sexuality. In 1966, Harry Benjamin [29] considered the causes of transsexuality to be badly understood, and argued that researchers were biased towards considering psychological causes over biological causes.

    Ray Blanchard has developed a taxonomy of male-to-female transsexualism [30] built upon the work of his colleague Kurt Freund, [31] which assumes that trans women have one of two primary causes for gender dysphoria. [32] [33] [34] Blanchard theorizes that "homosexual transsexuals" (a taxonomic category he uses to refer to trans women who are sexually attracted to men) are attracted to men and develop gender dysphoria typically during childhood, and characterizes them as displaying overt and obvious femininity since childhood; he characterizes "non-homosexual transsexuals" (a taxonomic category he uses to refer to trans women who are sexually attracted to women) as developing gender dysphoria primarily because they are autogynephilic (sexually aroused by the thought or image of themselves as a woman [30] ), and as being either attracted to women, attracted to both women and men (a concept he calls pseudo-bisexuality, as attraction to males is part of the performance of an autogynephilic sexual fantasy), or asexual.

    Autogynephilia is common among late-onset transgender women. [35] A study on autogynephilic men found that they were more gender dysphoric than non-autogynephilic men. [36] Michael Bailey speculated that autogynephilia may be genetic. [32]

    Blanchard's theory has gained support from J. Michael Bailey, Anne Lawrence, James Cantor, and others who argue that there are significant differences between the two groups, including sexuality, age of transition, ethnicity, IQ, fetishism, and quality of adjustment. [37] [38] [30] [39] [32] However, the theory has been criticized in papers from Veale, Nuttbrock, Moser, and others who argue that it is poorly representative of MtF transsexuals and non-instructive, and that the experiments behind it are poorly controlled and/or contradicted by other data. [40] [41] [42] [43] Many authorities, including some supporters of the theory, criticize Blanchard's choice of wording as confusing or degrading because it focuses on trans women's assigned sex and disregards their sexual orientation identity. [2] Lynn Conway, Andrea James, and Deirdre McCloskey attacked Bailey's reputation following the release of The Man Who Would Be Queen. [44] Evolutionary biologist and trans woman Julia Serano wrote that "Blanchard's controversial theory is built upon a number of incorrect and unfounded assumptions, and there are many methodological flaws in the data he offers to support it." [45] The World Professional Association for Transgender Health (WPATH) argued against including Blanchard's typology in the DSM, stating that there was no scientific consensus on the theory, and that there was a lack of longitudinal studies on the development of transvestic fetishism. [46]

    A 2016 review found support for the predictions of Blanchard's typology that androphilic and gynephilic trans women have different brain phenotypes. It stated that although Cantor seems to be right that Blanchard's predictions have been validated by two independent structural neuroimaging studies, there is "still only one study on nonhomosexual MtFs to fully confirm the hypothesis, more independent studies on nonhomosexual MtFs are needed. A much better verification of the hypothesis could be supplied by a specifically designed study including homosexual and nonhomosexual MtFs." The review stated that "confirming Blanchard's prediction still needs a specifically designed comparison of homosexual MtF, homosexual male, and heterosexual male and female people." [2]

    The failure of an attempt to raise David Reimer from infancy through adolescence as a girl after his genitals were accidentally mutilated is cited as disproving the theory that gender identity is determined solely by parenting. [47] [48] Between the 1960s and 2000, many other newborn and infant boys were surgically reassigned as females if they were born with malformed penises, or if they lost their penises in accidents. Many surgeons believed such males would be happier being socially and surgically reassigned female. Available evidence indicates that in such instances, parents were deeply committed to raising these children as girls and in as gender-typical a manner as possible. Six of seven cases providing orientation in adult follow-up studies identified as heterosexual males, with one retaining a female identity, but who is attracted to women. Such cases do not support the theory that parenting influences gender identity or sexual orientation of those assigned male at birth. [49] : 72–73 Reimer's case is used by organizations such as the Intersex Society of North America to caution against needlessly modifying the genitals of unconsenting minors. [50]

    In 2015, the American Academy of Pediatrics released a webinar series on gender, gender identity, gender expression, transgender, etc. [51] [52] In the first lecture, Dr. Sherer explains that parents' influence (through punishment and reward of behavior) can influence gender expression but not gender identity. [53] She cites a Smithsonian article that shows a photo of a 3-year-old President Franklin D. Roosevelt with long hair, wearing a dress. [54] [52] Children as old as 6 wore gender-neutral clothing, consisting of white dresses, until the 1940s. [54] In 1927, Time magazine printed a chart showing sex-appropriate colors, which consisted of pink for boys and blue for girls. [54] Dr. Sherer argued that kids will modify their gender expression to seek reward from their parents and society but this will not affect their gender identity (their internal sense of self). [53]
