Following includes codes used to do shape analysis of duck bones and then run phyloacc-c
We combined surface scans and CT/μCT scans to create a database of skeletal elements for three hindlimb bones (femur, tibiotarsus, and tarsometatarsus) along with the four forelimb bone (humerus, radius, ulna, and carpometacarpus) for 65 species of ducks. Our sampling for the species and the bone scans for those species are presented in the following table.
| Images | Scientific_Name | Common_Name | Voucher_Info | Genomes_NCBI | Femur | Tibiotarsus | Tarsometatarsus | Humerus | Radius | Ulna | Carpometacarpus |
|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
Aix galericulata | Mandarin Duck | USNM 622444 | GCA_024635365.1 | X | X | X | X | X | X | X |
|
|
Aix sponsa | Wood Duck | MCZ:Orn:347504 | GCA_032270905.1 | X | X | X | X | X | X | X |
|
|
Amazonetta brasiliensis | Brazilian Teal | USNM 631047 | X | X | X | X | X | X | X | |
|
|
Anas acuta | Northern Pintail | MCZ:Orn:341895 | GCA_963932015.1 | X | X | X | ||||
|
|
Anas aucklandica | Auckland Islands Teal | UMZC:Vertebrates:12/Ana/42/a/8-10 | X | X | X | X | X | X | X | |
|
|
Anas crecca | Green-winged Teal | MCZ:Orn:347036 | X | X | X | X | X | |||
|
|
Anas erythrorhyncha | Red-billed Duck | NHMUK:Zoo:S/1952.1.159 | X | X | X | X | ||||
|
|
Anas laysanensis | Laysan Duck | MCZ:Orn:347544 | X | X | X | |||||
|
|
Anas platyrhynchos | Mallard | MCZ:Orn:347156 | GCA900411745.1 | X | X | X | X | X | X | X |
|
|
Anas rubripes | American Black Duck | MCZ:Orn:337845 | X | X | X | X | X | X | ||
|
|
Anas zonorhyncha | Eastern Spot-billed Duck | USNM 633481 | GCA_002224875.1 | X | X | X | X | X | X | X |
|
|
Anhima cornuta | Horned Screamer | MCZ:Orn:346993 | GCA_013399475.1 | X | X | X | X | X | X | |
|
|
Anser albifrons | Greater White-fronted Goose | NHMUK:Zoo:1895.2.6.8 | X | X | X | X | X | X | X | |
|
|
Anser caerulescens | Snow Goose | MCZ:Orn:341885 | X | X | X | X | X | X | ||
|
|
Anser cygnoides | Swan Goose | USNM 19637 | GCA_026259575.1 | X | X | X | X | X | ||
|
|
Anser erythropus | Lesser White-fronted Goose | MCZ:Orn:340330 | X | X | X | |||||
|
|
Anser fabalis | Taiga Bean-Goose | MCZ:Orn:340262 | GCA_002592135.1 | X | X | X | X | X | ||
|
|
Anser indicus | Bar-headed Goose | USNM 557494 | GCA_025583725.1 | X | X | X | X | X | X | |
|
|
Anseranas semipalmata | Magpie Goose | NHMUK:Zoo:1852.7.22.1 | GCA_013399115.1 | X | X | X | X | X | ||
|
|
Asarcornis scutulata | White-winged Duck | USNM 612610 | GCA_013398475.1 | X | X | X | X | X | X | X |
|
|
Aythya americana | Redhead | MCZ:Orn:346918 | X | X | X | X | X | X | ||
|
|
Aythya fuligula | Tufted Duck | USNM 641850 | GCA_009819795.1 | X | X | X | X | X | X | X |
|
|
Branta bernicla | Brant | MCZ:Orn:336993 | X | X | X | X | X | X | X | |
|
|
Branta canadensis | Canada Goose | MCZ:Orn:346738 | GCA_006130075.1 | X | X | X | X | X | X | X |
|
|
Branta leucopsis | Barnacle Goose | MCZ:Orn:346732 | X | X | X | X | X | X | X | |
|
|
Bucephala clangula | Common Goldeneye | MCZ:Orn:347155 | X | X | X | X | X | X | X | |
|
|
Cairina moschata | Muscovy Duck | MCZ:Orn:348425 | GCA_018104995.1 | X | X | X | X | X | X | |
|
|
Cereopsis novaehollandiae | Cape Barren Goose | MCZ:Orn:347095 | X | X | X | X | X | X | X | |
|
|
Chauna chavaria | Northern Screamer | MCZ:Orn:340307 | X | X | X | X | ||||
|
|
Chenonetta jubata | Maned Duck | MCZ:Orn:339522 | X | X | X | |||||
|
|
Chloephaga hybrida | Kelp Goose | MCZ:Orn:340225 | X | X | X | X | X | X | ||
|
|
Chloephaga picta | Upland Goose | MCZ:Orn:342847 | X | X | X | X | ||||
|
|
Chloephaga poliocephala | Ashy-headed Goose | NHMUK:Zoo:S/1963.10.1 | X | X | X | X | X | X | ||
|
|
Chloephaga rubidiceps | Ruddy-headed Goose | MCZ:Orn:343209 | X | X | X | X | X | X | ||
|
|
Clangula hyemalis | Long-tailed Duck | MCZ:Orn:346497 | GCA_029619115.1 | X | X | X | X | X | ||
|
|
Cygnus columbianus | Tundra Swan | MCZ:Orn:343048 | X | X | X | X | ||||
|
|
Cygnus olor | Mute Swan | NHMUK:Zoo:1854.6.25.2 | GCA_009769625.2 | X | X | X | X | X | X | X |
|
|
Dendrocygna autumnalis | Black-bellied Whistling-Duck | MCZ:Orn:335880 | X | X | X | X | X | X | X | |
|
|
Dendrocygna bicolor | Fulvous Whistling-Duck | MCZ:Orn:347071 | X | X | X | X | X | X | X | |
|
|
Heteronetta atricapilla | Black-headed Duck | NHMUK:Zoo:S/2002.47.6 | GCA_011075105.1 | X | X | X | X | X | X | |
|
|
Histrionicus histrionicus | Harlequin Duck | NHMUK:Zoo:1843.6.23.23 | X | X | X | X | X | X | ||
|
|
Lophodytes cucullatus | Hooded Merganser | MCZ:Orn:347140 | X | X | X | X | X | X | ||
|
|
Lophonetta specularioides | Crested Duck | USNM 490992 | GCA_032165235.1 | X | X | X | X | X | X | X |
|
|
Malacorhynchus membranaceus | Pink-eared Duck | USNM 553596 | X | X | X | X | X | X | X | |
|
|
Mareca strepera | Gadwall | MCZ:Orn:341872 | GCA_035583915.1 | X | X | X | X | X | ||
|
|
Mareca sibilatrix | Chiloe Wigeon | NHMUK:Zoo:S/1985.72.2 | GCA_034783215.1 | X | X | X | X | X | X | X |
|
|
Marmaronetta angustirostris | Marbled Teal | USNM 500299 | X | X | X | X | X | X | X | |
|
|
Melanitta fusca | Velvet Scoter | MCZ:Orn:348703 | GCA_029620185.1 | X | X | X | X | X | ||
|
|
Melanitta perspicillata | Surf Scoter | MCZ:Orn:347151 | X | X | X | X | X | |||
|
|
Merganetta armata | Torrent Duck | MCZ:Orn:345094 | X | X | X | X | X | X | X | |
|
|
Mergellus albellus | Smew | MCZ:Orn:348520 | X | X | X | X | X | X | X | |
|
|
Mergus merganser | Common Merganser | MCZ:Orn:340318 | X | X | ||||||
|
|
Mergus serrator | Red-breasted Merganser | MCZ:Orn:343636 | X | X | X | X | X | X | X | |
|
|
Nettapus auritus | African Pygmy-Goose | MCZ:Orn:341374 | GCA_011076525.1 | X | X | X | ||||
|
|
Oxyura jamaicensis | Ruddy Duck | NHMUK:Zoo:S/1985.70.4 | GCA_011077185.1 | X | X | X | X | X | X | |
|
|
Oxyura ferruginea | Andean Duck | MCZ:Orn:341437 | X | X | X | X | X | |||
|
|
Sarkidiornis melanotos | Knob-billed Duck | USNM 488140 | X | X | X | X | X | X | X | |
|
|
Somateria mollissima | Common Eider | MCZ:Orn:336958 | GCA_951411735.1 | X | X | X | X | X | X | |
|
|
Spatula clypeata | Northern Shoveler | MCZ:Orn:347105 | X | X | X | X | X | X | ||
|
|
Spatula discors | Blue-winged Teal | NHMUK:Zoo:S/2001.41.5 | GCA_032165275.1 | X | X | X | X | X | X | X |
|
|
Stictonetta naevosa | Freckled Duck | USNM 555594 | GCA_011074415.1 | X | X | X | X | X | X | X |
|
|
Tachyeres brachypterus | Falkland Steamer-Duck | MCZ:Orn:342206 | GCA_032357605.1 | X | X | X | X | X | X | X |
|
|
Tachyeres patachonicus | Flying Steamer-Duck | USNM 491013 | X | X | X | X | X | X | X | |
|
|
Tachyeres pteneres | Flightless Steamer-Duck | USNM 490930 | X | X | X | X | X | X | X | |
|
|
Tadorna tadorna | Common Shelduck | MCZ:Orn:347538 | X | X | X | X | X | X | X |
After all the scans were collected, all scans were roughly aligned at the same coordinate space using the Align function in Meshmixer. For CT scans, we removed any internal structure by using the Select function to select the outer surface (setting the Crease Angle Thres option to >70 that prevents the selection from moving to the internal surface through holes) and using the Invert function to switch to internal surface, which were then deleted. We used Make Solid function to close any holes on the outside mesh, with a comparable Solid Accuracy and Mesh Density to avoid over smoothing of scans. Finally we exported this scan as a .obj file.
For landmarking, we used the pseudolandmarking (Rolfe, Davis, and Maga 2021) approach implemented through PseudoLMGenerator function of SlicerMorph (Rolfe et al. 2021) in Slicer. We used the Mallard Duck (Anas platyrhynchos) for the reference specimen to generate the primary landmark for all bones except the tibiotarsus (where the mallard specimen lacks the fibula) where we used the closely related Spot-billed Duck (Anas zonorhyncha). Spacing tolerance for each bone was adjusted to apply ~400 pseudolandmarks per bone.
| Bone | Spatial Tolerance | Number of Pseudolandmarks | Point Density Adjustment |
|---|---|---|---|
| Femur | 2.7 | 459 | 1.7 |
| Tibiotarsus | 2.0 | 480 | 1.9 |
| Tarsometatarsus | 2.8 | 431 | 1.7 |
| Humerus | 2.4 | 458 | 1.9 |
| Radius | 1.8 | 432 | 2.6 |
| Ulna | 2.2 | 452 | 2.0 |
| Carpometacarpus | 2.6 | 432 | 2.6 |
To transfer pseudolanmarks from one species to another, we used the single-template ALPACA framework (Porto, Rolfe, and Maga 2021). We adjusted the point density adjustment to create meshes of ~4000 points as recommended for ALPACA (values in table above). We used a BCPD acceleration for processing the samples. We used a batch processing to generate landmarks for bones of all the species.
We used the phylogeny from () for this study. For a few taxa missing in the tree that we had scans for, we appended a branch next to the known closest relative species.
duck_data <- xlsx::read.xlsx("Ducks_MCZ.xlsx", sheetIndex = 1, header = T)
duck_tree <- read.tree("duck_tree.tre")
# Root at the Anhimidae
duck_tree <- root(duck_tree, outgroup = c("Anhima_cornuta", "Chauna_chavaria", "Chauna_torquata"))
# Add missing species
duck_tree <- bind.tip(duck_tree, "Anas_aucklandica", where = which(duck_tree$tip.label=="Anas_chlorotis"), edge.length = 0.0143477498/4, position = 0.0143477498/4)
duck_tree <- bind.tip(duck_tree, "Anas_zonorhyncha", where = which(duck_tree$tip.label=="Anas_platyrhynchos"), edge.length = 0.0022507147/4, position = 0.0022507147/4)
duck_tree <- bind.tip(duck_tree, "Anser_erythropus", where = which(duck_tree$tip.label=="Anser_albifrons"), edge.length = 0.0017958047/4, position = 0.0017958047/4)
duck_tree <- bind.tip(duck_tree, "Asarcornis_scutulata", where = which(duck_tree$tip.label=="Cairina_moschata"), edge.length = 0.0067085060/4, position = 0.0067085060/4)
duck_tree$tip.label[which(duck_tree$tip.label == "Malacorhyncus_membranaceus")] <- "Malacorhynchus_membranaceus"
duck_tree <- bind.tip(duck_tree, "Mergus_merganser", where = which(duck_tree$tip.label=="Mergus_serrator"), edge.length = 0.0027206995/4, position = 0.0027206995/4)
duck_tree <- bind.tip(duck_tree, "Oxyura_ferruginea", where = which(duck_tree$tip.label=="Oxyura_jamaicensis"), edge.length = 0.0024525321/4, position = 0.0024525321/4)
duck_tree <- bind.tip(duck_tree, "Spatula_discors", where = which(duck_tree$tip.label=="Spatula_cyanoptera"), edge.length = 0.0027173372/4, position = 0.0027173372/4)
duck_tree <- bind.tip(duck_tree, "Anas_crecca", where = which(duck_tree$tip.label=="Anas_flavirostris"), edge.length = 0.0032631134/4, position = 0.0032631134/4)
duck_tree_orig <- duck_tree
To conduct a Generalized Procrustes Analysis we used the gpagen function of the geommorph package. We first summarized the information for all the specimens of a single bone using the GPA function in SlicerMorph to allow for easier upload into R work environment. For pca we used gm.prcomp with phylogeny from above.
# Load summarized data for all landmark files
logfile <- parser("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Femur_landmarks/gpa/2024-01-10_12_03_54/analysis.log", forceLPS = T)
# Run GPA
femur.gpa <- gpagen(logfile$LM, print.progress = F)
# Subset phylogenetic tree to keep only taxa for which scans exist
duck_tree <- keep.tip(duck_tree_orig, duck_data$Scientific_Name[duck_data$Femur %in% "X"])
dimnames(femur.gpa$coords)[[3]] <- unlist(lapply(str_split(dimnames(femur.gpa$coords)[[3]], "_"), FUN = function(x){paste(x[1], x[2], sep="_")}))
pca <- gm.prcomp(femur.gpa$coords, phy = duck_tree)
# pc1 (25.87%), pc2 (19.36%), pc3 (7.10%), pc4 (5.34%), pc5 (5.01%)
asp_ratio <- 1.618
duck_pca <- data.frame(species=names(pca$x[,1]), PC1=pca$x[,1], PC2=pca$x[,2], PC3=pca$x[,3])
duck_pca <- merge(duck_pca, duck_data[,c("Scientific_Name", "Images", "Femur")], by.x = "species", by.y = "Scientific_Name", all.x = T, all.y = F)
g1 <- ggplot(duck_pca, aes(PC1, PC2)) +
geom_image(aes(image=Images), size=0.075, by = "width", asp = asp_ratio)+
theme_classic()+
theme(aspect.ratio = 1/asp_ratio)
g1
We also generated warped meshes to visualize bone structure relevant to the different PC axes.
ref <- mshape(femur.gpa$coords)
ref_obj <- read.ply("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_femur.ply", ShowSpecimen = F)
ref_lnd <- readland.fcsv("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_femur_new.fcsv")
warped_ref <- warpRefMesh(ref_obj, ref_lnd, ref, color = "ivory", centered = FALSE)
g2 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$min, mesh=warped_ref, method = "surface")
g3 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$max, mesh=warped_ref, method = "surface")
g4 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$min, mesh=warped_ref, method = "surface")
g5 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$max, mesh=warped_ref, method = "surface")
mesh2ply(warped_ref, "C:/Subir/Projects/ducks/pc_mesh/femur_ref")
mesh2ply(g2, "C:/Subir/Projects/ducks/pc_mesh/femur_pc1_min")
mesh2ply(g3, "C:/Subir/Projects/ducks/pc_mesh/femur_pc1_max")
mesh2ply(g4, "C:/Subir/Projects/ducks/pc_mesh/femur_pc2_min")
mesh2ply(g5, "C:/Subir/Projects/ducks/pc_mesh/femur_pc2_max")
logfile <- parser("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Tibiotarsus_landmarks/gpa/2024-03-15_15_13_05/analysis.log", forceLPS = T)
tibiotarsus.gpa <- gpagen(logfile$LM, print.progress = F)
duck_tree <- keep.tip(duck_tree_orig, duck_data$Scientific_Name[duck_data$Tibiotarsus %in% "X"])
dimnames(tibiotarsus.gpa$coords)[[3]] <- unlist(lapply(str_split(dimnames(tibiotarsus.gpa$coords)[[3]], "_"), FUN = function(x){paste(x[1], x[2], sep="_")}))
pca <- gm.prcomp(tibiotarsus.gpa$coords, phy = duck_tree)
# pc1 (29.38%), pc2 (11.31%), pc3 (10.16%), pc4 (7.18%), pc5 (4.51%)
asp_ratio <- 1.618
duck_pca <- data.frame(species=names(pca$x[,1]), PC1=pca$x[,1], PC2=pca$x[,2], PC3=pca$x[,3], PC4=pca$x[,4], PC5=pca$x[,5])
duck_pca <- merge(duck_pca, duck_data[,c("Scientific_Name", "Images", "Tibiotarsus")], by.x = "species", by.y = "Scientific_Name", all.x = T, all.y = F)
g1 <- ggplot(duck_pca, aes(PC1, PC3)) +
geom_image(aes(image=Images), size=0.075, by = "width", asp = asp_ratio)+
theme_classic()+
theme(aspect.ratio = 1/asp_ratio)
g1
ref <- mshape(tibiotarsus.gpa$coords)
ref_obj <- read.ply("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_zonorhyncha_USNM633481_tibiotarsus.ply", ShowSpecimen = F)
ref_lnd <- readland.fcsv("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_zonorhyncha_USNM633481_tibiotarsus.fcsv")
warped_ref <- warpRefMesh(ref_obj, ref_lnd, ref, color = "ivory", centered = FALSE)
g2 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$min, mesh=warped_ref, method = "surface")
g3 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$max, mesh=warped_ref, method = "surface")
g4 <- plotRefToTarget(ref, pca$shapes$shapes.comp3$min, mesh=warped_ref, method = "surface")
g5 <- plotRefToTarget(ref, pca$shapes$shapes.comp3$max, mesh=warped_ref, method = "surface")
mesh2ply(warped_ref, "C:/Subir/Projects/ducks/pc_mesh/tibiotarsus_ref")
mesh2ply(g2, "C:/Subir/Projects/ducks/pc_mesh/tibiotarsus_pc1_min")
mesh2ply(g3, "C:/Subir/Projects/ducks/pc_mesh/tibiotarsus_pc1_max")
mesh2ply(g4, "C:/Subir/Projects/ducks/pc_mesh/tibiotarsus_pc3_min")
mesh2ply(g5, "C:/Subir/Projects/ducks/pc_mesh/tibiotarsus_pc3_max")
logfile <- parser("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Tarsometatarsus_landmarks/gpa/2024-03-13_10_38_11/analysis.log", forceLPS = T)
tarsometatarsus.gpa <- gpagen(logfile$LM, print.progress = F)
duck_tree <- keep.tip(duck_tree_orig, duck_data$Scientific_Name[duck_data$Tarsometatarsus %in% "X"])
dimnames(tarsometatarsus.gpa$coords)[[3]] <- unlist(lapply(str_split(dimnames(tarsometatarsus.gpa$coords)[[3]], "_"), FUN = function(x){paste(x[1], x[2], sep="_")}))
pca <- gm.prcomp(tarsometatarsus.gpa$coords, phy = duck_tree)
# pc1 (40.49%), pc2 (10.91%), pc3 (7.74%), pc4 (5.44%), pc5 (4.45%)
asp_ratio <- 1.618
duck_pca <- data.frame(species=names(pca$x[,1]), PC1=pca$x[,1], PC2=pca$x[,2], PC3=pca$x[,3], PC4=pca$x[,4], PC5=pca$x[,5])
duck_pca <- merge(duck_pca, duck_data[,c("Scientific_Name", "Images", "Tarsometatarsus")], by.x = "species", by.y = "Scientific_Name", all.x = T, all.y = F)
g1 <- ggplot(duck_pca, aes(PC1, PC2)) +
geom_image(aes(image=Images), size=0.075, by = "width", asp = asp_ratio)+
theme_classic()+
theme(aspect.ratio = 1/asp_ratio)
g1
ref <- mshape(tarsometatarsus.gpa$coords)
ref_obj <- read.ply("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_tarsometatarsus.ply", ShowSpecimen = F)
ref_lnd <- readland.fcsv("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_tarsometatarsus.fcsv")
warped_ref <- warpRefMesh(ref_obj, ref_lnd, ref, color = "ivory", centered = FALSE)
g2 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$min, mesh=warped_ref, method = "surface")
g3 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$max, mesh=warped_ref, method = "surface")
g4 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$min, mesh=warped_ref, method = "surface")
g5 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$max, mesh=warped_ref, method = "surface")
mesh2ply(warped_ref, "C:/Subir/Projects/ducks/pc_mesh/tarsometatarsus_ref")
mesh2ply(g2, "C:/Subir/Projects/ducks/pc_mesh/tarsometatarsus_pc1_min")
mesh2ply(g3, "C:/Subir/Projects/ducks/pc_mesh/tarsometatarsus_pc1_max")
mesh2ply(g4, "C:/Subir/Projects/ducks/pc_mesh/tarsometatarsus_pc2_min")
mesh2ply(g5, "C:/Subir/Projects/ducks/pc_mesh/tarsometatarsus_pc2_max")
logfile <- parser("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Humerus_landmarks/gpa/2024-03-06_15_23_54/analysis.log", forceLPS = T)
humerus.gpa <- gpagen(logfile$LM, print.progress = F)
duck_tree <- keep.tip(duck_tree_orig, duck_data$Scientific_Name[duck_data$Humerus %in% "X"])
duck_tree <- root(duck_tree, outgroup = c("Anhima_cornuta", "Chauna_chavaria"))
dimnames(humerus.gpa$coords)[[3]] <- unlist(lapply(str_split(dimnames(humerus.gpa$coords)[[3]], "_"), FUN = function(x){paste(x[1], x[2], sep="_")}))
pca <- gm.prcomp(humerus.gpa$coords, phy = duck_tree)
# pc1 (24.13%), pc2 (12.16%), pc3 (9.49%), pc4 (8.66%), pc5 (6.59%)
asp_ratio <- 1.618
duck_pca <- data.frame(species=names(pca$x[,1]), PC1=pca$x[,1], PC2=pca$x[,2], PC3=pca$x[,3])
duck_pca <- merge(duck_pca, duck_data[,c("Scientific_Name", "Images", "Humerus")], by.x = "species", by.y = "Scientific_Name", all.x = T, all.y = F)
g1 <- ggplot(duck_pca, aes(PC1, PC2)) +
geom_image(aes(image=Images), size=0.075, by = "width", asp = asp_ratio)+
theme_classic()+
theme(aspect.ratio = 1/asp_ratio)
g1
ref <- mshape(humerus.gpa$coords)
ref_obj <- read.ply("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_humerus.ply", ShowSpecimen = F)
ref_lnd <- readland.fcsv("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_humerus_new.fcsv")
warped_ref <- warpRefMesh(ref_obj, ref_lnd, ref, color = "ivory", centered = FALSE)
g2 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$min, mesh=warped_ref, method = "surface")
g3 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$max, mesh=warped_ref, method = "surface")
g4 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$min, mesh=warped_ref, method = "surface")
g5 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$max, mesh=warped_ref, method = "surface")
mesh2ply(warped_ref, "C:/Subir/Projects/ducks/pc_mesh/humerus_ref")
mesh2ply(g2, "C:/Subir/Projects/ducks/pc_mesh/humerus_pc1_min")
mesh2ply(g3, "C:/Subir/Projects/ducks/pc_mesh/humerus_pc1_max")
mesh2ply(g4, "C:/Subir/Projects/ducks/pc_mesh/humerus_pc2_min")
mesh2ply(g5, "C:/Subir/Projects/ducks/pc_mesh/humerus_pc2_max")
logfile <- parser("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/radius_landmarks/gpa/2024-03-21_10_47_30/analysis.log", forceLPS = T)
radius.gpa <- gpagen(logfile$LM, print.progress = T)
##
## Performing GPA
## | | | 0% | |================== | 25% | |=================================== | 50% | |======================================================================| 100%
##
## Making projections... Finished!
duck_tree <- keep.tip(duck_tree_orig, duck_data$Scientific_Name[duck_data$Radius %in% "X"])
dimnames(radius.gpa$coords)[[3]] <- unlist(lapply(str_split(dimnames(radius.gpa$coords)[[3]], "_"), FUN = function(x){paste(x[1], x[2], sep="_")}))
pca <- gm.prcomp(radius.gpa$coords, phy = duck_tree)
# pc1 (28.75%), pc2 (18.06%), pc3 (10.83%), pc4 (9.17%), pc5 (5.86%)
asp_ratio <- 1.618
duck_pca <- data.frame(species=names(pca$x[,1]), PC1=pca$x[,1], PC2=pca$x[,2], PC3=pca$x[,3], PC4=pca$x[,4], PC5=pca$x[,5])
duck_pca <- merge(duck_pca, duck_data[,c("Scientific_Name", "Images", "Radius")], by.x = "species", by.y = "Scientific_Name", all.x = T, all.y = F)
g1 <- ggplot(duck_pca, aes(PC1, PC2)) +
geom_image(aes(image=Images), size=0.075, by = "width", asp = asp_ratio)+
theme_classic()+
theme(aspect.ratio = 1/asp_ratio)
g1
ref <- mshape(radius.gpa$coords)
ref_obj <- read.ply("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_radius.ply", ShowSpecimen = F)
ref_lnd <- readland.fcsv("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_radius.fcsv")
warped_ref <- warpRefMesh(ref_obj, ref_lnd, ref, color = "ivory", centered = FALSE)
g2 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$min, mesh=warped_ref, method = "surface")
g3 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$max, mesh=warped_ref, method = "surface")
g4 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$min, mesh=warped_ref, method = "surface")
g5 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$max, mesh=warped_ref, method = "surface")
mesh2ply(warped_ref, "C:/Subir/Projects/ducks/pc_mesh/radius_ref")
mesh2ply(g2, "C:/Subir/Projects/ducks/pc_mesh/radius_pc1_min")
mesh2ply(g3, "C:/Subir/Projects/ducks/pc_mesh/radius_pc1_max")
mesh2ply(g4, "C:/Subir/Projects/ducks/pc_mesh/radius_pc2_min")
mesh2ply(g5, "C:/Subir/Projects/ducks/pc_mesh/radius_pc2_max")
logfile <- parser("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/ulna_landmarks/gpa/2024-03-20_10_31_20/analysis.log", forceLPS = T)
ulna.gpa <- gpagen(logfile$LM, print.progress = F)
duck_tree <- keep.tip(duck_tree_orig, duck_data$Scientific_Name[duck_data$Ulna %in% "X"])
dimnames(ulna.gpa$coords)[[3]] <- unlist(lapply(str_split(dimnames(ulna.gpa$coords)[[3]], "_"), FUN = function(x){paste(x[1], x[2], sep="_")}))
pca <- gm.prcomp(ulna.gpa$coords, phy = duck_tree)
# pc1 (28.35%), pc2 (19.04%), pc3 (14.70%), pc4 (7.03%), pc5 (5.61%)
asp_ratio <- 1.618
duck_pca <- data.frame(species=names(pca$x[,1]), PC1=pca$x[,1], PC2=pca$x[,2], PC3=pca$x[,3], PC4=pca$x[,4], PC5=pca$x[,5])
duck_pca <- merge(duck_pca, duck_data[,c("Scientific_Name", "Images", "Ulna")], by.x = "species", by.y = "Scientific_Name", all.x = T, all.y = F)
g1 <- ggplot(duck_pca, aes(PC1, PC2)) +
geom_image(aes(image=Images), size=0.075, by = "width", asp = asp_ratio)+
theme_classic()+
theme(aspect.ratio = 1/asp_ratio)
g1
ref <- mshape(ulna.gpa$coords)
ref_obj <- read.ply("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_ulna.ply", ShowSpecimen = F)
ref_lnd <- readland.fcsv("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_ulna.fcsv")
warped_ref <- warpRefMesh(ref_obj, ref_lnd, ref, color = "ivory", centered = FALSE)
g2 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$min, mesh=warped_ref, method = "surface")
g3 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$max, mesh=warped_ref, method = "surface")
g4 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$min, mesh=warped_ref, method = "surface")
g5 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$max, mesh=warped_ref, method = "surface")
mesh2ply(warped_ref, "C:/Subir/Projects/ducks/pc_mesh/ulna_ref")
mesh2ply(g2, "C:/Subir/Projects/ducks/pc_mesh/ulna_pc1_min")
mesh2ply(g3, "C:/Subir/Projects/ducks/pc_mesh/ulna_pc1_max")
mesh2ply(g4, "C:/Subir/Projects/ducks/pc_mesh/ulna_pc2_min")
mesh2ply(g5, "C:/Subir/Projects/ducks/pc_mesh/ulna_pc2_max")
logfile <- parser("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Carpometacarpus_landmarks/gpa/2024-03-19_11_18_33/analysis.log", forceLPS = T)
carpometacarpus.gpa <- gpagen(logfile$LM, print.progress = F)
duck_tree <- keep.tip(duck_tree_orig, duck_data$Scientific_Name[duck_data$Carpometacarpus %in% "X"])
dimnames(carpometacarpus.gpa$coords)[[3]] <- unlist(lapply(str_split(dimnames(carpometacarpus.gpa$coords)[[3]], "_"), FUN = function(x){paste(x[1], x[2], sep="_")}))
pca <- gm.prcomp(carpometacarpus.gpa$coords, phy = duck_tree)
# pc1 (16.79%), pc2 (15.00%), pc3 (14.15%), pc4 (8.73%), pc5 (8.11%)
asp_ratio <- 1.618
duck_pca <- data.frame(species=names(pca$x[,1]), PC1=pca$x[,1], PC2=pca$x[,2], PC3=pca$x[,3], PC4=pca$x[,4], PC5=pca$x[,5])
duck_pca <- merge(duck_pca, duck_data[,c("Scientific_Name", "Images", "Carpometacarpus")], by.x = "species", by.y = "Scientific_Name", all.x = T, all.y = F)
g1 <- ggplot(duck_pca, aes(PC1, PC2)) +
geom_image(aes(image=Images), size=0.075, by = "width", asp = asp_ratio)+
theme_classic()+
theme(aspect.ratio = 1/asp_ratio)
g1
ref <- mshape(carpometacarpus.gpa$coords)
ref_obj <- read.ply("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_carpometacarpus.ply", ShowSpecimen = F)
ref_lnd <- readland.fcsv("C:/Users/subirshakya/OneDrive - Harvard University/Ducks/Recalibrated/Anas_platyrhynchos_MCZ347156_carpometacarpus.fcsv")
warped_ref <- warpRefMesh(ref_obj, ref_lnd, ref, color = "ivory", centered = FALSE)
g2 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$min, mesh=warped_ref, method = "surface")
g3 <- plotRefToTarget(ref, pca$shapes$shapes.comp1$max, mesh=warped_ref, method = "surface")
g4 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$min, mesh=warped_ref, method = "surface")
g5 <- plotRefToTarget(ref, pca$shapes$shapes.comp2$max, mesh=warped_ref, method = "surface")
mesh2ply(warped_ref, "C:/Subir/Projects/ducks/pc_mesh/carpometacarpus_ref")
mesh2ply(g2, "C:/Subir/Projects/ducks/pc_mesh/carpometacarpus_pc1_min")
mesh2ply(g3, "C:/Subir/Projects/ducks/pc_mesh/carpometacarpus_pc1_max")
mesh2ply(g4, "C:/Subir/Projects/ducks/pc_mesh/carpometacarpus_pc2_min")
mesh2ply(g5, "C:/Subir/Projects/ducks/pc_mesh/carpometacarpus_pc2_max")
duck_pca_genome <- duck_pca[which(duck_pca$species %in% duck_data$Scientific_Name[!is.na(duck_data$Genomes)]),]
duck_tree_genome <- keep.tip(duck_tree, duck_pca_genome$species)
duck_pca_genome$species <- factor(duck_pca_genome$species, levels = duck_tree_genome$tip.label)
duck_pca_genome <- duck_pca_genome[order(duck_pca_genome$species),]
p <- ggtree(duck_tree_genome) +
xlim(NA,0.2)+
geom_tiplab(size=3)+
geom_facet(panel = "Femur_PC1", geom = geom_col, data = duck_pca_genome,
mapping = aes(x = PC1), orientation='y', width=0.6)+
geom_tiplab(aes(image = c(duck_pca_genome$Images, rep("NA", 24))), geom="image",
offset=0.1, align = T)
facet_widths(p, c(0.75, 0.25))
duck_pca_genome <- duck_pca[which(duck_pca$species %in% duck_data$Scientific_Name[!is.na(duck_data$Genomes)]),]
duck_tree_genome <- keep.tip(duck_tree, duck_pca_genome$species)
duck_pca_genome$species <- factor(duck_pca_genome$species, levels = duck_tree_genome$tip.label)
duck_pca_genome <- duck_pca_genome[order(duck_pca_genome$species),]
p <- ggtree(duck_tree_genome) +
xlim(NA,0.2)+
geom_tiplab(size=3)+
geom_facet(panel = "tarsometatarsus_PC1", geom = geom_col, data = duck_pca_genome,
mapping = aes(x = PC1), orientation='y', width=0.6)+
geom_tiplab(aes(image = c(duck_pca_genome$Images, rep("NA", 22))), geom="image",
offset=0.1, align = T)
facet_widths(p, c(0.75, 0.25))
duck_pca_genome <- duck_pca[which(duck_pca$species %in% duck_data$Scientific_Name[!is.na(duck_data$Genomes)]),]
duck_tree_genome <- keep.tip(duck_tree, duck_pca_genome$species)
duck_pca_genome$species <- factor(duck_pca_genome$species, levels = duck_tree_genome$tip.label)
duck_pca_genome <- duck_pca_genome[order(duck_pca_genome$species),]
p <- ggtree(duck_tree_genome) +
xlim(NA,0.2)+
geom_tiplab(size=3)+
geom_facet(panel = "humerus_PC1", geom = geom_col, data = duck_pca_genome,
mapping = aes(x = PC1), orientation='y', width=0.6)+
geom_tiplab(aes(image = c(duck_pca_genome$Images, rep("NA", 26))), geom="image",
offset=0.1, align = T)
facet_widths(p, c(0.75, 0.25))
duck_pca_genome <- duck_pca[which(duck_pca$species %in% duck_data$Scientific_Name[!is.na(duck_data$Genomes)]),]
duck_tree_genome <- keep.tip(duck_tree, duck_pca_genome$species)
duck_pca_genome$species <- factor(duck_pca_genome$species, levels = duck_tree_genome$tip.label)
duck_pca_genome <- duck_pca_genome[order(duck_pca_genome$species),]
p <- ggtree(duck_tree_genome) +
xlim(NA,0.2)+
geom_tiplab(size=3)+
geom_facet(panel = "carpometacarpus_PC1", geom = geom_col, data = duck_pca_genome,
mapping = aes(x = PC1), orientation='y', width=0.6)+
geom_tiplab(aes(image = c(duck_pca_genome$Images, rep("NA", 19))), geom="image",
offset=0.1, align = T)
facet_widths(p, c(0.75, 0.25))
We aligned 27 duck genomes (GenBank numbers in table) and also included the chicken genomes using progressive cactus. The tree used for the alignment is shown below.
(((((((((Aix_galericulata:0.0051473029,Aix_sponsa:0.0053483288):0.0017143373,(Cairina_moschata:0.0016771265,Asarcornis_scutulata:0.0016771265):0.0050313795):0.0030667213,((((Lophonetta_specularioides:0.0100951666,Tachyeres_brachypterus:0.0055556181):0.0028194957,((Anas_acuta:0.0042863335,(Anas_platyrhynchos:0.00124933445,Anas_zonorhyncha:0.00124933445):0.00374800335):0.001201751,(Mareca_sibilatrix:0.0037871541,Mareca_strepera:0.0037867301):0.0014935713):0.0011642544):0.0004165566,Spatula_discors:0.007856085):0.0023604916,Aythya_fuligula:0.009531404):0.0012462578):0.0009209619,(Melanitta_fusca:0.0064818964,(Clangula_hyemalis:0.0070627445,Somateria_mollissima:0.0073593856):0.0003851771):0.0014142496):0.0173506804,((((Anser_fabalis:0.0020284789,Anser_cygnoides:0.001826199):0.0008868183,Anser_indicus:0.0025060375):0.0036566335,Branta_canadensis:0.006571712):0.0029378261,Cygnus_olor:0.0091439003):0.0085572595):0.000654519,(((Heteronetta_atricapilla:0.0213298248,Stictonetta_naevosa:0.0132137813):0.0004721905,Oxyura_jamaicensis:0.0155460859):0.0065766967,Nettapus_auritus:0.0249469325):0.0034281317):0.0328791421,Anseranas_semipalmata:0.0308188616):0.01,Anhima_cornuta:0.0500106949):0.01,Gallus_gallus:0.0308188616);
To run cactus:
#!/bin/bash
#SBATCH -c 1
#SBATCH -t 99-00:00
#SBATCH -p holy-smokes
#SBATCH --nodelist=holy7c0908
#SBATCH --mem=1000
#SBATCH -o ./logs/cac_%j.out
#SBATCH -e ./logs/cac_%j.err
cd /n/holyscratch01/informatics/sshakya/ducks/
source activate PhyloAcc
singularity exec --cleanenv cactus_v2.7.1-gpu.sif cactus-prepare duck_aln.txt --outDir snk_output --jobStore snk_temp --gpu
snakemake -p -s ~/GenomeAnnotation/genome_alignment2/cactus-gpu-snakemake/cactus_gpu.smk --configfile duck_aln.yaml --profile ~/GenomeAnnotation/genome_alignment2/cactus-gpu-snakemake/profiles/slurm_profile/
To get neutral rates, we used the chicken annotations. See conserved_rates for steps to get neutral rates.
We used the coordinates from conserved elements produced in Shakya et al. for getting alignments of conserved elements. See elements_fasta for steps to get conserved elements.
We ran PhyloAcc-C model for PC1 of each of the four selected bones. To run PhyloAcc-C, we used the following code:
library(coda)
library(parallel)
suppressMessages(library(sm))
library(argparse)
library(stringi)
library(ape)
ELL_do_mcmc = function(row_i) {
mclapply(1:args$runs, function(i) {
pac3::run_model(tree=tree, alignment=(all_alignments[[row_i]]),
trait_tbl=trait_tbl,
transition_matrix=Q, stationary_distro=PI, n_iter=args$iter,
inner_iter=500,
Phi_prior=c(1.0, 1.0, 1.0, 1.0, 1.0, 1.0),
prior_r2=c(4,0.1),
prior_r3=c(10,0.4))
}, mc.cores=1) # N.B. set to 1 = lapply, so do parallelism at element level
}
ELL_make_mcmclist = function(mcmc_outs, n_burnin=n_burnin, n_thin=1) {
cs <- lapply(mcmc_outs, function(mcmc_out) {
the_trace = with(mcmc_out$mcmc_trace,
data.frame(lv =v_trace[,1],
lb2=v_trace[,2],
lb3=v_trace[,3]))
as.mcmc(the_trace, start=(n_burnin+1), thin=n_thin)
})
as.mcmc.list(cs)
}
do_ith_element = function(i) {
print(paste("Working on element ",i," of ",length(all_alignments)," of batch ",ceiling(start_index/args$batch_size) , sep=""))
the_mcmc_outs = ELL_do_mcmc(i)
the_trace = do.call(rbind, lapply(1:args$runs, function(i) the_mcmc_outs[[i]]$mcmc_trace$v_trace[-(1:args$burnin),])) # needs the rbind otherwise don't get correct quantiles
the_mcmc_list = ELL_make_mcmclist(the_mcmc_outs)
rhat = gelman.diag(the_mcmc_list)$mpsrf
the_density = sm.density(the_trace[,2:3], model="Normal", eval.grid=F, eval.points=matrix(data=c(0,0), byrow=T, nrow=1, ncol=2), display="none")
the_density2= sm.density(the_trace[,3]-the_trace[,2], model="Normal", eval.grid=F, eval.points=0, display="none")
qlv = as.numeric(quantile(the_trace[,1], prob=c(.1,.5,.9)))
qlb2 = as.numeric(quantile(the_trace[,2], prob=c(.1,.5,.9)))
qlb3 = as.numeric(quantile(the_trace[,3], prob=c(.1,.5,.9)))
qrat = as.numeric(quantile(the_trace[,3]- the_trace[,2], prob=c(.1,.5,.9)))
list( lv.10=qlv [1], lv.50=qlv [2], lv.90=qlv [3],
lb2.10=qlb2[1], lb2.50=qlb2[2], lb2.90=qlb2[3],
lb3.10=qlb3[1], lb3.50=qlb3[2], lb3.90=qlb3[3],
rat.10=qrat[1], rat.50=qrat[2], rat.90=qrat[3],
rhat=as.numeric(rhat),
dens0=as.numeric(the_density$estimate),
densr=as.numeric(the_density2$estimate))
}
do_these_elements = function(indices) {
mclapply(indices, function(i) {
the_res = do_ith_element(i)
the_res$NAME = names(all_alignments[i])
the_res
}, mc.cores=args$cores)
}
# Create Argument Parser
parser <- ArgumentParser(description = "Run PhyloAcc-c")
# Add arguments
parser$add_argument("--mod_file", type="character", help = "Path to mod file with tree and neutral rates")
parser$add_argument("--output", type="character", help = "Output file name")
parser$add_argument("--output_folder", default="./", type="character", help = "Output folder name (default = ./)")
parser$add_argument("--fasta", type="character", help = "Fasta folder")
parser$add_argument("--fasta_list", type="character", help = "Optional list to only include from fasta")
parser$add_argument("--traits", type="character", help = "Tab-delimited file with species and trait")
parser$add_argument("--batch_size", type="integer", default=100, help = "How many elements to run and write at one time")
parser$add_argument("--runs", type="integer", default=3, help = "Number of runs per element (default=3)")
parser$add_argument("--cores", type="integer", default=1, help = "How many cores to run in parallel")
parser$add_argument("--iter", type="integer", default=10000, help = "Number of iterations per element (default=10000)")
parser$add_argument("--burnin", type="integer", default=1000, help = "Number of iterations of burnin to discard (default=1000)")
parser$add_argument("--max_missing", type="double", default=0.5, help = "Fraction of taxa need to be present in alignment (default=0.5)")
# Parse arguments
args <- parser$parse_args()
print(args)
# Parse mod file
lines <- readLines(args$mod_file)
for (line_index in seq_along(lines)) {
# Split the line by ":"
parts <- stri_split_fixed(lines[line_index], ":", n=2)[[1]]
# Extract variable name and IDs
variable <- trimws(parts[1]) # Trim whitespace
if (variable == "BACKGROUND") {
PI <- as.numeric(unlist(strsplit(trimws(parts[2]), "\\s+")))
}
if (variable == "RATE_MAT"){
mat_vals <- strsplit(lines[(line_index+1):(line_index+4)], "\\s+")
Q <- do.call(rbind, mat_vals)
Q <- Q[,-1]
Q <- matrix(as.numeric(Q), nrow=4, byrow=T)
line_index <- line_index + 4
}
if (variable == "TREE"){
tree = read.tree(text = trimws(parts[2]))
tree$node.label <- paste("Anc", 0:Nnode(tree), sep="")
}
}
# Get trait data
trait_tbl <- read.table(args$traits)
colnames(trait_tbl) <- c("NAME", "Y")
tree <- keep.tip(tree, trait_tbl$NAME)
trait_tbl$NAME <- factor(trait_tbl$NAME, levels = tree$tip.label)
trait_tbl <- trait_tbl[order(trait_tbl$NAME),]
row.names(trait_tbl) <- 1:nrow(trait_tbl)
alignment_loc <- args$fasta
fasta_files <- list.files(alignment_loc, full.names = T, include.dirs = F)
fasta_basenames <- tools::file_path_sans_ext(basename(fasta_files))
if (!is.null(args$fasta_list)){
keep_fasta <- read.table(args$fasta_list)
fasta_files <- fasta_files[which(fasta_basenames %in% keep_fasta$V1)]
fasta_basenames <- fasta_basenames[which(fasta_basenames %in% keep_fasta$V1)]
}
#some basic housekeeping
start_run <- T
if (!file.exists(args$output_folder)) {
# If it doesn't exist, create the output folder
dir.create(args$output_folder)
}
# Loop in batches of batch size
for (start_index in seq(1, length(fasta_files), by = args$batch_size)){
all_alignments <- list()
for (each_file in start_index:(start_index + args$batch_size - 1)){
element <- fasta_basenames[each_file]
alignment <- read.dna(fasta_files[each_file], format = "fasta", as.matrix = T, as.character = T)
if ((nrow(alignment) > ((length(tree$tip.label))*args$max_missing))) {
alignment <- alignment[rownames(alignment) %in% trait_tbl$NAME,]
alignment[alignment=="*"] <- "-"
alignment[alignment=="n"] <- "-"
all_same <- all(apply(alignment, 2, function(x) length(unique(x))) == 1)
missing_taxa <- setdiff(tree$tip.label, rownames(alignment))
if (length(missing_taxa > 0)){
for (miss in missing_taxa){
missing_row <- matrix(rep("-", ncol(alignment)), nrow=1)
rownames(missing_row) <- miss
alignment <- rbind(alignment, missing_row)
}
}
if (!all_same){
all_alignments <- append(all_alignments, setNames(list(alignment), element))
}
}
}
if(length(all_alignments) > 0){
system.time(yyy <- do_these_elements(1:length(all_alignments)))
tbl_to_write = do.call(rbind, yyy)
out_fn = paste(args$output_folder, "/", args$output, ".tsv", sep="")
if (file.exists(out_fn) & start_run){
write.table(tbl_to_write, out_fn, sep="\t", row.names=F, quote=F, col.names=T)
start_run <- F
}
write.table(tbl_to_write, out_fn, sep="\t", row.names=F, quote=F, col.names=F, append=T)
}
}
This script can then be run as such:
scaffold="NC_052572.1"
Rscript run_phyloaccc.r --mod_file ducks_4d_Z_non-conserved.mod --output ${scaffold} --output_folder phylo_out_carpometacarpus --trait duck_carpometacarpus_pc1.txt --fasta fasta/fasta_Z/${scaffold}/ --fasta_list bed_ranges/${scaffold}.txt --batch_size 200 --cores 40 --iter 10000 --burnin 1000