Graham E. Fagg

Thara Angskun, Graham E. Fagg, George Bosilca, Jelena Pjesivac-Grbovic, Jack J. Dongarra. Self-healing network for scalable fault-tolerant runtime environments. Future Gener. Comput. Syst. 26(3): 479-485 (2010). https://doi.org/10.1016/j.future.2009.04.001
Ralph H. Castain, Timothy S. Woodall, David J. Daniel, Jeffrey M. Squyres, Brian Barrett, Graham E. Fagg. The Open Run-Time Environment (OpenRTE): A transparent multicluster environment for high-performance computing. Future Gener. Comput. Syst. 24(2): 153-157 (2008). https://doi.org/10.1016/j.future.2007.03.010
Jelena Pjesivac-Grbovic, Thara Angskun, George Bosilca, Graham E. Fagg, Edgar Gabriel, Jack J. Dongarra. Performance analysis of MPI collective operations. Clust. Comput. 10(2): 127-143 (2007). https://doi.org/10.1007/s10586-007-0012-0
Jelena Pjesivac-Grbovic, George Bosilca, Graham E. Fagg, Thara Angskun, Jack J. Dongarra. MPI collective algorithm selection and quadtree encoding. Parallel Comput. 33(9): 613-623 (2007). https://doi.org/10.1016/j.parco.2007.06.005
Thara Angskun, George Bosilca, Graham E. Fagg, Jelena Pjesivac-Grbovic, Jack J. Dongarra. Reliability Analysis of Self-Healing Network using Discrete-Event Simulation. CCGRID 2007: 437-444. https://doi.org/10.1109/CCGRID.2007.95
Jelena Pjesivac-Grbovic, George Bosilca, Graham E. Fagg, Thara Angskun, Jack J. Dongarra. Decision Trees and MPI Collective Algorithm Selection Problem. Euro-Par 2007: 107-117. https://doi.org/10.1007/978-3-540-74466-5_13
Jack J. Dongarra, George Bosilca, Zizhong Chen, Victor Eijkhout, Graham E. Fagg, Erika Fuentes, Julien Langou, Piotr Luszczek, Jelena Pjesivac-Grbovic, Keith Seymour, Haihang You, Sathish S. Vadhiyar. Self-adapting numerical software (SANS) effort. IBM J. Res. Dev. 50(2-3): 223-238 (2006). https://doi.org/10.1147/rd.502.0223
Yuan Tang, Graham E. Fagg, Jack J. Dongarra. Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation. CCGRID 2006: 27-34. https://doi.org/10.1109/CCGRID.2006.81
Jelena Pjesivac-Grbovic, Graham E. Fagg, Thara Angskun, George Bosilca, Jack J. Dongarra. MPI Collective Algorithm Selection and Quadtree Encoding. PVM/MPI 2006: 40-48. https://doi.org/10.1007/11846802_14
David Dewolfs, Jan Broeckhove, Vaidy S. Sunderam, Graham E. Fagg. FT-MPI, Fault-Tolerant Metacomputing and Generic Name Services: A Case Study. PVM/MPI 2006: 133-140. https://doi.org/10.1007/11846802_24
Thara Angskun, Graham E. Fagg, George Bosilca, Jelena Pjesivac-Grbovic, Jack J. Dongarra. Scalable Fault Tolerant Protocol for Parallel Runtime Environments. PVM/MPI 2006: 141-149. https://doi.org/10.1007/11846802_25
Rainer Keller, George Bosilca, Graham E. Fagg, Michael M. Resch, Jack J. Dongarra. Implementation and Usage of the PERUSE-Interface in Open MPI. PVM/MPI 2006: 347-355. https://doi.org/10.1007/11846802_48
Edgar Gabriel, Graham E. Fagg, Jack J. Dongarra. Evaluating Dynamic Communicators and One-Sided Operations for Current MPI Libraries. Int. J. High Perform. Comput. Appl. 19(1): 67-79 (2005). https://doi.org/10.1177/1094342005051197
Graham E. Fagg, Edgar Gabriel, Zizhong Chen, Thara Angskun, George Bosilca, Jelena Pjesivac-Grbovic, Jack J. Dongarra. Process Fault Tolerance: Semantics, Design and Applications for High Performance Computing. Int. J. High Perform. Comput. Appl. 19(4): 465-477 (2005). https://doi.org/10.1177/1094342005056137
Jelena Pjesivac-Grbovic, Thara Angskun, George Bosilca, Graham E. Fagg, Edgar Gabriel, Jack J. Dongarra. Performance Analysis of MPI Collective Operations. IPDPS 2005. https://doi.org/10.1109/IPDPS.2005.335
Zizhong Chen, Graham E. Fagg, Edgar Gabriel, Julien Langou, Thara Angskun, George Bosilca, Jack J. Dongarra. Fault tolerant high performance computing by a coding approach. PPoPP 2005: 213-223. https://doi.org/10.1145/1065944.1065973
Graham E. Fagg, George Bosilca. Advanced Message Passing and Threading Issues. PVM/MPI 2005: 7. https://doi.org/10.1007/11557265_5
Graham E. Fagg, Thara Angskun, George Bosilca, Jelena Pjesivac-Grbovic, Jack J. Dongarra. Scalable Fault Tolerant MPI: Extending the Recovery Algorithm. PVM/MPI 2005: 67-75. https://doi.org/10.1007/11557265_13
Julien Langou, George Bosilca, Graham E. Fagg, Jack J. Dongarra. Hash Functions for Datatype Signatures in MPI. PVM/MPI 2005: 76-83. https://doi.org/10.1007/11557265_14
Ralph H. Castain, Timothy S. Woodall, David J. Daniel, Jeffrey M. Squyres, Brian Barrett, Graham E. Fagg. The Open Run-Time Environment (OpenRTE): A Transparent Multi-cluster Environment for High-Performance Computing. PVM/MPI 2005: 225-232. https://doi.org/10.1007/11557265_31
David Dewolfs, Dawid Kurzyniec, Vaidy S. Sunderam, Jan Broeckhove, Tom Dhaene, Graham E. Fagg. Applicability of Generic Naming Services and Fault-Tolerant Metacomputing with FT-MPI. PVM/MPI 2005: 268-275. https://doi.org/10.1007/11557265_36
Sathish S. Vadhiyar, Graham E. Fagg, Jack J. Dongarra. Towards an Accurate Model for Collective Communications. Int. J. High Perform. Comput. Appl. 18(1): 159-167 (2004). https://doi.org/10.1177/1094342004041297
Graham E. Fagg, Jack J. Dongarra. Building and Using a Fault-Tolerant MPI Implementation. Int. J. High Perform. Comput. Appl. 18(3): 353-361 (2004). https://doi.org/10.1177/1094342004046052
Edgar Gabriel, Graham E. Fagg, George Bosilca, Thara Angskun, Jack J. Dongarra, Jeffrey M. Squyres, Vishal Sahay, Prabhanjan Kambadur, Brian Barrett, Andrew Lumsdaine, Ralph H. Castain, David J. Daniel, Richard L. Graham, Timothy S. Woodall. Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation. PVM/MPI 2004: 97-104. https://doi.org/10.1007/978-3-540-30218-6_19
Timothy S. Woodall, Richard L. Graham, Ralph H. Castain, David J. Daniel, Mitchel W. Sukalski, Graham E. Fagg, Edgar Gabriel, George Bosilca, Thara Angskun, Jack J. Dongarra, Jeffrey M. Squyres, Vishal Sahay, Prabhanjan Kambadur, Brian Barrett, Andrew Lumsdaine. Open MPI's TEG Point-to-Point Communications Methodology: Comparison to Existing Implementations. PVM/MPI 2004: 105-111. https://doi.org/10.1007/978-3-540-30218-6_20
Timothy S. Woodall, Richard L. Graham, Ralph H. Castain, David J. Daniel, Mitchel W. Sukalski, Graham E. Fagg, Edgar Gabriel, George Bosilca, Thara Angskun, Jack J. Dongarra, Jeffrey M. Squyres, Vishal Sahay, Prabhanjan Kambadur, Brian Barrett, Andrew Lumsdaine. TEG: A High-Performance, Scalable, Multi-network Point-to-Point Communications Methodology. PVM/MPI 2004: 303-310. https://doi.org/10.1007/978-3-540-30218-6_43
Alessandro Bassi, Micah Beck, Terry Moore, James S. Plank, D. Martin Swany, Richard Wolski, Graham E. Fagg. The Internet Backplane Protocol: a study in resource sharing. Future Gener. Comput. Syst. 19(4): 551-561 (2003). https://doi.org/10.1016/S0167-739X(03)00033-5
Edgar Gabriel, Graham E. Fagg, Jack J. Dongarra. Evaluating the Performance of MPI-2 Dynamic Communicators and One-Sided Communication. PVM/MPI 2003: 88-97. https://doi.org/10.1007/978-3-540-39924-7_16
Graham E. Fagg, Jack J. Dongarra. HARNESS fault tolerant MPI design, usage and performance issues. Future Gener. Comput. Syst. 18(8): 1127-1142 (2002). https://doi.org/10.1016/S0167-739X(02)00090-0
Alessandro Bassi, Micah Beck, Graham E. Fagg, Terry Moore, James S. Plank, D. Martin Swany, Richard Wolski. The Internet Backplane Protocol: A Study in Resource Sharing. CCGRID 2002: 194-201. https://doi.org/10.1109/CCGRID.2002.1017127
Antoine Petitet, L. Susan Blackford, Jack J. Dongarra, Brett Ellis, Graham E. Fagg, Kenneth Roche, Sathish S. Vadhiyar. Numerical Libraries and the Grid. Int. J. High Perform. Comput. Appl. 15(4): 359-374 (2001). https://doi.org/10.1177/109434200101500403
Graham E. Fagg, Antonin Bukovsky, Jack J. Dongarra. HARNESS and fault tolerant MPI. Parallel Comput. 27(11): 1479-1495 (2001). https://doi.org/10.1016/S0167-8191(01)00100-4
Sathish S. Vadhiyar, Graham E. Fagg, Jack J. Dongarra. Towards an Accurate Model for Collective Communications. International Conference on Computational Science (1) 2001: 41-50. https://doi.org/10.1007/3-540-45545-0_14
Graham E. Fagg, Antonin Bukovsky, Jack J. Dongarra. Fault Tolerant MPI for the HARNESS Meta-computing System. International Conference on Computational Science (1) 2001: 355-366. https://doi.org/10.1007/3-540-45545-0_44
Graham E. Fagg, Edgar Gabriel, Michael M. Resch, Jack J. Dongarra. Parallel IO Support for Meta-computing Applications: MPI_Connect IO Applied to PACX-MPI. PVM/MPI 2001: 135-147. https://doi.org/10.1007/3-540-45417-9_22
Antoine Petitet, L. Susan Blackford, Jack J. Dongarra, Brett Ellis, Graham E. Fagg, Kenneth Roche, Sathish S. Vadhiyar. Numerical libraries and the grid: the GrADS experiments with ScaLAPACK. SC 2001: 14. https://doi.org/10.1145/582034.582048
Graham E. Fagg, Jack J. Dongarra. FT-MPI: Fault Tolerant MPI, Supporting Dynamic Applications in a Dynamic World. PVM/MPI 2000: 346-353. https://doi.org/10.1007/3-540-45255-9_47
Graham E. Fagg, Sathish S. Vadhiyar, Jack J. Dongarra. ACCT: Automatic Collective Communications Tuning. PVM/MPI 2000: 354-362. https://doi.org/10.1007/3-540-45255-9_48
Sathish S. Vadhiyar, Graham E. Fagg, Jack J. Dongarra. Automatically Tuned Collective Communications. SC 2000: 3. https://doi.org/10.1109/SC.2000.10024
Micah Beck, Jack J. Dongarra, Graham E. Fagg, Al Geist, Paul Gray, James Arthur Kohl, Mauro Migliardi, Keith Moore, Terry Moore, Philip Papadopoulous. HARNESS: a next generation distributed virtual machine. Future Gener. Comput. Syst. 15(5-6): 571-582 (1999). https://doi.org/10.1016/S0167-739X(99)00010-2
Graham E. Fagg, Keith Moore, Jack J. Dongarra. Scalable networked information processing environment (SNIPE). Future Gener. Comput. Syst. 15(5-6): 595-605 (1999). https://doi.org/10.1016/S0167-739X(99)00012-6
Matthias Brune, Graham E. Fagg, Michael M. Resch. Message-passing environments for metacomputing. Future Gener. Comput. Syst. 15(5-6): 699-712 (1999). https://doi.org/10.1016/S0167-739X(99)00020-5
Jack J. Dongarra, Graham E. Fagg, Al Geist, James Arthur Kohl, Philip M. Papadopoulos, Stephen L. Scott, Vaidy S. Sunderam, M. Magliardi. HARNESS: Heterogeneous Adaptable Reconfigurable NEtworked SystemS. HPDC 1998: 358-359. https://doi.org/10.1109/HPDC.1998.710029
Graham E. Fagg, Kevin S. London, Jack J. Dongarra. MPI_Connect Managing Heterogeneous MPI Applications Interoperation and Process Control. PVM/MPI 1998: 93-96. https://doi.org/10.1007/BFb0056563
Graham E. Fagg, Jack J. Dongarra, Al Geist. PVMPI Provides Interoperability Between MPI Implementations. PP 1997.
Graham E. Fagg, Jack J. Dongarra, Al Geist. Heterogeneous MPI Application Interoperation and Process Management under PVMPI. PVM/MPI 1997: 91-98. https://doi.org/10.1007/3-540-63697-8_74
Graham E. Fagg, Keith Moore, Jack J. Dongarra, Al Geist. Scalable Networked Information Processing Environment (SNIPE). SC 1997: 24. https://doi.org/10.1145/509593.509617
Shirley A. Williams, Graham E. Fagg. A Comparison of Developing Codes for Distributed and Parallel Architectures. UK Parallel 1996: 110-118. https://doi.org/10.1007/978-1-4471-1504-5_8
Graham E. Fagg, Kevin S. London, Jack J. Dongarra. Taskers and General Resource Managers: PVM Supporting DCE Process Management. PVM 1996: 180-187. https://doi.org/10.1007/3540617795_23
Graham E. Fagg, Shirley A. Williams. Improved Program Performance Using a Cluster of Workstations. Parallel Algorithms Appl. 6(2-3): 233-236 (1995). https://doi.org/10.1080/10637199508915511
Shirley A. Williams, Philip C. H. Mitchell, Graham E. Fagg. A Cluster Computing Implementation of a Monte Carlo Simulation of a Particle Growth Mechanism. Parallel Algorithms Appl. 4(3-4): 275-280 (1994). https://doi.org/10.1080/10637199408915468

Co-authors: Thara Angskun, Brian W. Barrett, Brian Barrett, Alessandro Bassi, Micah Beck, L. Susan Blackford, George Bosilca, Jan Broeckhove, Matthias Brune, Antonin Bukovsky, Ralph H. Castain, Zizhong Chen, David J. Daniel, David Dewolfs, Tom Dhaene, Jack J. Dongarra, Victor Eijkhout, Brett Ellis, Erika Fuentes, Edgar Gabriel, Al Geist, Richard L. Graham, Paul Gray, Prabhanjan Kambadur, Rainer Keller, James Arthur Kohl, Dawid Kurzyniec, Julien Langou, Kevin S. London, Andrew Lumsdaine, Piotr Luszczek, M. Magliardi, Mauro Migliardi, Philip C. H. Mitchell, Keith Moore, Terry Moore, Philip M. Papadopoulos, Philip Papadopoulous, Antoine Petitet, Jelena Pjesivac-Grbovic, James S. Plank, Michael M. Resch, Kenneth Roche, Vishal Sahay, Stephen L. Scott, Keith Seymour, Jeffrey M. Squyres, Mitchel W. Sukalski, Vaidy S. Sunderam, D. Martin Swany, Yuan Tang, Sathish S. Vadhiyar, Shirley Williams, Shirley A. Williams, Richard Wolski, Timothy S. Woodall, Haihang You