HE HE! 🙂
Data Scientist vs Data Analyst (funny)
Hottest off the press…
Data Science Platform Speeds Python Queries
News from Brown, Kevin Stacey. July 1, 2021
Researchers at Brown University and the Massachusetts Institute of Technology have crafted a new data science platform to run queries written in Python more efficiently. Python’s simplicity makes it data scientists’ preferred coding language for generating user-defined functions, but analytics platforms run into trouble in efficiently handling these bits of Python code. The Tuplex framework is designed to compile a highly specialized program for a specific query and its common-case input data, filtering out uncommon input data and referring it to an interpreter. Brown’s Leonhard Spiegelberg said, “This allows us to simplify the compilation problem as we only need to care about a single set of data types and common-case assumptions.”
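The split between a specialized common-case path and a general fallback can be sketched in plain Python. This is only an illustration of the idea and not Tuplex's API: the function names, the row format, and the sample data are all hypothetical, and nothing is actually compiled here.

```python
def parse_fast(row):
    # Common-case path (hypothetical): assumes a row shaped like "name,123"
    # with exactly two fields and an integer price.
    name, price = row.split(",")
    return name, int(price)


def parse_general(row):
    # General fallback (hypothetical): tolerates a "$" prefix, decimals,
    # and missing or unparsable prices.
    parts = row.split(",")
    name = parts[0]
    raw = parts[1].strip().lstrip("$") if len(parts) > 1 else ""
    try:
        return name, float(raw)
    except ValueError:
        return name, None


def run_query(rows):
    results, exceptions = [], []
    for index, row in enumerate(rows):
        try:
            results.append((index, parse_fast(row)))
        except ValueError:
            # Rows that violate the common-case assumptions are set aside.
            exceptions.append((index, row))
    for index, row in exceptions:
        results.append((index, parse_general(row)))
    results.sort(key=lambda pair: pair[0])  # restore input order
    return [value for _, value in results], len(exceptions)


rows = ["widget,10", "gadget,$12.50", "broken"]
values, slow_rows = run_query(rows)
print(values, slow_rows)
```

The fast path stays simple because it only has to handle one shape of input; everything else pays the cost of the slower, more tolerant routine.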
OLD BE HERE!
Deep ML Completes Information About the Bioactivity of 1 Million Molecules
IRB Barcelona (Spain)
June 28, 2021
Deep machine learning (ML) computational models have deduced experimental data for 1 million chemical compounds, guiding the development of a suite of programs for creating estimates of any type of molecule. Scientists at Spain’s Institute for Research in Medicine Barcelona based the technique on the Chemical Checker, the largest database of bioactivity profiles for pseudo pharmaceuticals. The database is missing critical data, which the new tool provides by integrating all the experimental information available so the bioactivity profiles for all molecules can be completed. Said the Institute’s Patrick Aloy, “The new tool also allows us to forecast the bioactivity spaces of new molecules, and this is crucial in the drug discovery process as we can select the most suitable candidates and discard those that, for one reason or another, would not work.”
DNA-Based Storage System with Files and Metadata
June 15, 2021
A new DNA-based system for storing images created by researchers at the Massachusetts Institute of Technology (MIT) and the Broad Institute encapsulates data-encoding DNA file sequences within silicon-dioxide glass beads that are surface-labeled with fluorescent single-stranded DNA tags. The tagged beads are blended into a data library that benefits from long-term stability and zero-energy maintenance. The researchers stored a keyword-associated archive of images in the DNA, with each keyword encoded in the DNA attached to the bead’s surface. The system permits Boolean searches of multiple terms, and since each tag can be viewed as a piece of metadata about the DNA-stored image, the beads collectively function as a metadata-driven image database.
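As a software analogy only, the Boolean search over keyword tags behaves like set operations on a metadata index. The file names and keywords below are hypothetical and are not taken from the study; the real system performs the selection physically on tagged beads.

```python
# Hypothetical library: each stored file is described by a set of keyword tags.
library = {
    "img_001": {"cat", "orange", "domestic"},
    "img_002": {"cat", "black", "wild"},
    "img_003": {"dog", "black", "domestic"},
}


def search(library, all_of=(), none_of=()):
    # Return files that carry every tag in all_of and no tag in none_of.
    required, excluded = set(all_of), set(none_of)
    return sorted(
        name
        for name, tags in library.items()
        if required <= tags and not (excluded & tags)
    )


print(search(library, all_of=["cat"], none_of=["wild"]))
print(search(library, all_of=["black", "domestic"]))
```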
Researchers Identify 16 Medicines That Could Be Used to Treat COVID-19
June 16, 2021
Researchers from the ESI International Chair of the CEU Cardenal Herrera University and ESI Group used a new computational topology strategy to determine which existing medicines could be used to treat COVID-19. The model uses topological data analysis to compare the three-dimensional structure of the target proteins of known medicines to SARS-CoV-2 coronavirus proteins, such as protein NSP12. The researchers studied 1,825 medicines approved by the U.S. Food and Drug Administration, which are connected to 27,830 protein structures. In comparing the topological structure of the proteins available in the Protein Data Bank with the 23 proteins of the SARS-CoV-2 coronavirus, they identified 16 medicines that act against the three viral proteins found to have highly significant topological similarities to target protein structures of the known medicines. These drugs can now be studied to determine the most effective combination of them to treat COVID-19 symptoms.
Computer Simulations Visualize in Atomic Detail How DNA Opens While Wrapped Around Proteins
Hubrecht Institute (Netherlands)
June 3, 2021
Computer models developed by researchers at the Netherlands’ Hubrecht Institute and Germany’s Max Planck Institute for Molecular Biomedicine provide atomic-scale visuals of DNA transitioning from an inactive (closed) to an active (open) state while sheathing proteins. The simulations included real-time animations of nucleosomes (constituent elements of the chromatin that packs DNA within the cell nucleus) over one-microsecond intervals. The researchers used these films to track the nucleosomes’ opening-closing motion, and Hubrecht’s Vlad Cojocaru said increasing computational power soon will enable the team to model milliseconds of a nucleosome’s lifecycle. Cojocaru said the ability to simulate multiple nucleosomes “will give unprecedented insights into the mechanisms that regulate gene expression.”
Hottest off the press AKA now 🙂
A new way to visualize mountains of biological data
A machine learning method developed by engineers and scientists at the University of Missouri (MU) and Ohio State University can analyze massive volumes of data from single-cell RNA-sequencing, with potential applications in precision medicine. The technique applies a graph neural network to generate a visual representation of the analyzed data, in order to help identify patterns easily. The graph is made up of dots, each representing a cell, and similar cell types are color-coded for easy recognition. MU’s Dong Xu said, “With this data, scientists can study the interactions between cells within the micro-environment of a cancerous tissue, or watch the T-cells, B-cells, and immune cells all try to attack the cancerous cells.”
Speeding New COVID Treatments with Computational Tool (University of New Mexico)
Scientists at the University of New Mexico (UNM) and the University of Texas at El Paso have developed a computational tool to help drug researchers quickly identify molecules that could block the virus before it invades human cells or disable it in the early stages of infection. The team unveiled REDIAL-2020, an open source suite of computational models that can help to rapidly screen small molecules for potential COVID-fighting traits. REDIAL-2020 is based on machine learning (ML) algorithms that quickly process massive volumes of data and tease out patterns that might be missed by human researchers. The team validated the ML forecasts by comparing datasets from the National Center for Advancing Translational Sciences to the known effects of approved drugs in UNM’s DrugCentral database.
SMU’s ChemGen completes essential drug discovery work in days
Southern Methodist University (SMU) researchers have developed ChemGen, a set of computer-driven routines that emulate chemical reactions in a laboratory, significantly reducing the time and costs of drug discovery. ChemGen accelerates pharmacological optimization from months to days, using high-performance computers like SMU’s ManeFrame II shared high-performance computing cluster. The tool computationally generates molecular variants of the original chemical key, mimicking reactions under various combinations of circumstances. SMU’s John Wise said, “A research group or pharmaceutical company need only actually synthesize the molecules with the best chances of being improved, leaving the thousands of unimproved molecules in the computer and not on the lab bench.”
Just For fun 🙂
Insta360 VR: Flying Over Iceland Volcano – A Virtual Reality Experience
TO THE EARTH’S CORE 360° – VR Video
OLDER STUFF BELOW…
SARS-CoV-2 gene & COVID-19 mutation impact by comparing 44 Sarbecovirus genomes
Happy New Year Edition!
Google’s deep-learning program for determining the 3D shapes of proteins stands to transform biology, say scientists.
AlphaFold: The making of a scientific breakthrough
Protein folding explained
Other Misc. Stuff
New Oct 2020
Deep learning gives drug design a boost
Tuesday August 25, 2020 8:00 pm – 9:00 pm
Q&A with Scientists and Producer/Director of Controversy to Cure
HOT OFF PRESS July 11th
A new high-resolution, 3-D map of the whole mouse brain (Allen Institute for Brain Science)
See a 3D mouse brain with single-cell resolution (AAAS)
Why are stock prices going up when the economy is in ruins? Here’s some helpful context (Fortune)
Eleven tips for working with large data sets (Nature)
Big data are difficult to handle. These tips and tricks can smooth the way.
Same/similar as story below, or not?
Deep Learning, 3D Technology to Improve Structure Modeling for Protein Interactions, Create Better Drugs (Purdue University News)
Researchers at Purdue University have developed a system that applies deep learning principles to virtual models of protein interactions in an effort to better understand how proteins interact in the body, with the goal of developing better drugs that specifically target these interactions. The DOVE (DOcking decoy selection with Voxel-based deep neural nEtwork) system captures the structural and energetic features of the interface of a protein docking model with a three-dimensional (3D) box and judges whether the model is more likely to be correct or incorrect through the use of a 3D convolutional neural network. Said Purdue’s Daisuke Kihara, “This may be the first time researchers have successfully used deep learning and 3D features to quickly understand the effectiveness of certain protein models.”
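To give a feel for the "three-dimensional box" input, here is a minimal voxelization sketch: it bins atom coordinates into a cubic grid of counts, the kind of tensor a 3D convolutional network could consume. It is a simplification under stated assumptions. DOVE's actual inputs include structural and energetic features, and the function name, coordinates, box size, and grid resolution here are hypothetical.

```python
def voxelize(points, center, box_size, grid):
    # Count how many points fall in each cell of a grid x grid x grid cube
    # of edge length box_size centred on `center`.
    half = box_size / 2.0
    cell = box_size / grid
    volume = [[[0] * grid for _ in range(grid)] for _ in range(grid)]
    for point in points:
        index = []
        for coord, mid in zip(point, center):
            offset = coord - (mid - half)
            if not 0 <= offset < box_size:
                break  # point lies outside the box; skip it
            # min() guards against rounding pushing the index to `grid`.
            index.append(min(int(offset // cell), grid - 1))
        else:
            i, j, k = index
            volume[i][j][k] += 1
    return volume


# Hypothetical atom coordinates around an interface centred at the origin.
atoms = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (100.0, 0.0, 0.0)]
volume = voxelize(atoms, center=(0.0, 0.0, 0.0), box_size=4.0, grid=4)
print(volume[2][2][2])
```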
A Digital Approach to Targeting Proteins in Disease
Purdue University News
Researchers at Purdue University have developed software that better targets specific sites on proteins in the human body, helping scientists to create more effective drugs to treat cancers and other diseases. The NmrLineGuru software was designed for “fast nuclear magnetic resonance lineshape simulation and analysis with multistate equilibrium binding models,” according to Purdue’s Chao Feng. NmrLineGuru allows researchers and scientists to learn more about proteins linked to serious diseases, in order to help in the drug discovery process. The software will permit deeper study into the interactions between the proteins involved in a disease state and the molecules that connect them. Said Feng, “Our work stems from a deep understanding of the need in the drug discovery field for better and faster solutions to understand how to affect biological functions of proteins in the body.”
Brain-like functions emerging in a metallic nanowire network (EurekAlert, AAAS) aka folks who do journal Science 🙂
Brand Spanking New Nov 8th 🙂
‘Big data’ for life sciences—A human protein co-regulation map reveals new insights into protein functions
Even more brand-spanking NEW! 🙂
Language-Based Software’s Accurate Predictions Translate to Benefits for Chemists
September 30, 2019
Researchers at the University of Cambridge in the U.K. have developed a software program that can predict chemical reaction outcomes and retrosynthetic steps. The Molecular Transformer software uses a new type of neural network that is easier to train and more accurate than those that powered earlier translation-based approaches to chemistry. The Molecular Transformer is built around a neural network with a transformer architecture. Neural networks based on the transformer architecture make heavy use of a mechanism called attention, which allows them to learn which parts of the input are relevant to each part of the output, regardless of their positions. This reduces the amount of training needed and improves the resulting language models’ accuracy. The team found that the Molecular Transformer outperformed other language-based approaches, predicting the correct reaction outcome 90% of the time.
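For readers curious what "attention" computes, below is a minimal sketch of generic scaled dot-product attention in pure Python. It is the textbook form of the mechanism, not the Molecular Transformer's code, and the tiny query, key, and value vectors are made up for illustration.

```python
import math


def softmax(scores):
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    # For each query, weight every value by how well its key matches the query.
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
queries = [[4.0, 0.0]]  # points along the first key
outputs, weights = attention(queries, keys, values)
print(weights[0])
```

The weights depend only on how well each key matches the query, so reordering the key and value pairs leaves the output unchanged. That is the "regardless of their positions" property mentioned above; real transformers add positional information separately.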
Numbers Limit How Accurately Digital Computers Model Chaos
September 24, 2019
Researchers at University College London (UCL) in the U.K. and Tufts University found digital computers employ numbers that are based on flawed versions of actual numbers, which may lead to inaccuracies in simulations of chaotic systems and limit high-performance computing and machine learning applications. Digital computers only use rational numbers, which can be expressed as fractions, and these fractions’ denominators must be a power of 2. The researchers used 4 billion single-precision floating-point numbers, ranging from plus to minus infinity, to compare the mathematical reality of a one-parameter chaotic system to digital systems’ forecasts if all available single-precision floating-point numbers were utilized. The predictions were completely incorrect for certain values of the parameter, while calculations for others appeared correct, but deviated by up to 15%. Said UCL’s Peter Coveney, “Chaos is more commonplace than many people may realize and even for very simple chaotic systems, numbers used by digital computers can lead to errors that are not obvious but can have a big impact. Ultimately, computers can’t simulate everything.”
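The power-of-2 denominator point is easy to see in Python, and a simple chaotic map shows how it can bite. Two assumptions to flag: the doubling map below is a stand-in chosen for illustration (the summary does not name the system studied), and Python floats are double precision rather than the single precision used in the study.

```python
from fractions import Fraction


def is_power_of_two(n):
    # True for 1, 2, 4, 8, ...
    return n > 0 and n & (n - 1) == 0


def steps_to_zero(x, max_steps=200):
    # Iterate the doubling map x -> 2x mod 1 and report when the orbit hits 0,
    # or None if it never does within max_steps.
    for step in range(1, max_steps + 1):
        x = (2 * x) % 1
        if x == 0:
            return step
    return None


# The float nearest to 0.1 is an exact fraction with a power-of-2 denominator.
numerator, denominator = (0.1).as_integer_ratio()
print(denominator, is_power_of_two(denominator))

# In floating point, each doubling strips one bit, so the orbit collapses to 0.
print(steps_to_zero(0.1))

# With exact arithmetic, the orbit of 1/10 cycles forever and never reaches 0.
print(steps_to_zero(Fraction(1, 10)))
```

The floating-point orbit dies out because every representable starting value is a fraction over a power of 2, while the true orbit of 1/10 keeps cycling. The simulation is therefore qualitatively wrong, not merely imprecise.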
NEW #1 September 13
Summary from video…
Ginkgo Bioworks, a Boston company specializing in “engineering custom organisms,” aims to reinvent manufacturing, agriculture, biodesign, and more.
Biologists, software engineers, and automated robots are working side by side to accelerate the speed of nature by taking synthetic DNA, remixing it, and programming microbes, turning custom organisms into mini-factories that could one day pump out new foods, fuels, and medicines.
While there are possibly numerous positive and exciting outcomes from this research, like engineering gut bacteria to produce drugs inside the human body on demand or building self-fertilizing plants, the threat of potential DNA sequences harnessing a pathological function still exists.
That’s why Ginkgo Bioworks is developing a malware software to effectively stomp out the global threat of biological weapons, ensuring that synthetic biology can’t be used for evil.
Learn more about synthetic DNA and this biological assembly line on this episode of Focal Point.
Our scientific understanding of the universe is advancing at an unprecedented rate. Join Focal Point as we meet the people building tomorrow’s world. Witness the astonishing discoveries that will propel humanity forward and zero-in on the places where science-fiction becomes science-reality.
Seeker empowers the curious to understand the science shaping our world. We tell award-winning stories about the natural forces and groundbreaking innovations that impact our lives, our planet, and our universe.
NEW # 2 September 13 Check out!
AI Learns the Language of Chemistry to Predict How to Make Medicines
University of Cambridge, September 3, 2019
Researchers at the University of Cambridge in the U.K. have developed a machine learning algorithm that predicts the result of chemical reactions with greater accuracy than human chemists and suggests ways to create complex molecules. The algorithm trained on millions of reactions published in patents using pattern recognition tools to identify how chemical groups in molecules react. The team thought of chemical reaction prediction as a machine translation problem in which the reacting molecules are considered one “language,” while the product is considered a different language. The model uses the patterns in the text to learn how to “translate” between the two languages. The team found the model was 90% accurate in predicting the correct product of unseen chemical reactions, while trained human chemists achieve around 80% accuracy.
NEW #3 September 13, Just for a chuckle 🙂
IBM Gives Cancer-Killing Drug AI Project to the Open Source Community
July 22, 2019
IBM has released to the open source community three artificial intelligence (AI) projects designed to address the challenge of curing cancer. The projects, led by researchers at IBM’s Computational Systems Biology Group in Switzerland, involve developing AI and machine learning approaches to help accelerate the understanding of the leading drivers and molecular mechanisms of different cancers. The first project, PaccMann, is working to develop an algorithm that can automatically analyze chemical compounds and predict which are most likely to overcome cancer strains. The second project, “Interaction Network infErence from vectoR representATions of words” (INtERAcT), aims to develop a tool that can automatically extract information from the thousands of papers published every year on cancer research. The third project, “pathway-induced multiple kernel learning,” focuses on an algorithm that uses datasets describing what is currently known about molecular interactions to predict the prognosis of cancer patients.
Russia Targeted 2016 State Elections with ‘Unprecedented Level of Activity,’ Senate Intel Report Says
July 25, 2019
A report released by the Senate Intelligence Committee provided new information on the “unprecedented” Russian cyber activity targeting U.S. election infrastructure ahead of the 2016 presidential election, finding that election systems in all 50 states were targeted. The committee released its preliminary findings on election security in May last year, and will release four more final installments on other areas of focus. Russian intentions for U.S. election infrastructure “remain unclear,” said the latest report, which confirmed the 2018 finding that no votes were changed and no voting machines were manipulated. “Russia might have intended to exploit vulnerabilities in election infrastructure during the 2016 elections and, for unknown reasons, decided not to execute those options,” the report said. “Alternatively, Russia might have sought to gather information in the conduct of traditional espionage activities.” The report called for, among other things, an “overarching cyber doctrine” to outline deterrent measures.
AI Drug Hunters Could Give Big Pharma a Run for Its Money
Robert Langreth, Bloomberg
July 15, 2019
Using the latest neural-network algorithms, DeepMind, the artificial intelligence (AI) arm of Alphabet, beat seasoned biologists at 50 top labs from around the world in predicting the shapes of proteins. The company’s win at the CASP13 meeting in Mexico in December has serious implications, as a tool able to accurately model protein structures could speed up the development of new drugs. Although DeepMind’s simulation was unable to produce the atomic-level resolution necessary for drug discovery, its victory points to the potential for practical application of AI in one of the most expensive and failure-prone parts of the pharmaceutical business. AI could be used, for example, to scan millions of high-resolution cellular images to identify therapies researchers might otherwise have missed. In the short term, experts say AI-based simulations likely will be used to determine whether prospective drugs will be effective before proceeding to a full clinical trial.
FRESH OFF THE PRESS JUST NOW 🙂
Why Take an Epsom Salt Bath? (WebMD)
Epsom Salt: Benefits, Uses, and Side Effects – Healthline
What are the benefits of an Epsom salt detox?
Epsom salt detox: Benefits and how it works – Medical News Today
Epsom Salt Uses and Benefits
Video Gamers Design Brand-New Proteins
June 4, 2019
A multi-institutional initiative led by University of Washington School of Medicine researchers encoded specialized knowledge into the Foldit computer game to facilitate synthetic protein design. Foldit gamers previously were limited to interacting with known proteins. The researchers added biochemical knowledge by modifying the game’s operating code, so designer molecules that scored well in the game would be more likely to fold up as desired in the real world. The researchers tested 146 Foldit-player-designed proteins in a laboratory, of which 56 exhibited stability; the team compiled sufficient data on four of the new molecules to demonstrate the designs adopted their intended configurations. The University of Massachusetts, Dartmouth’s Firas Khatib suggests the Foldit milestone could aid research into the design of new drugs.
NEWER’ISH, AKA OLDER, NOW
Older Even newer!
‘Virtual Pharmacology’ Advance Tackles Universe of Unknown Drugs
University of California, San Francisco
February 6, 2019
Researchers at the University of California, San Francisco and the University of North Carolina have developed the world’s largest virtual pharmacology platform. The platform, which will soon contain over a billion virtual molecules never before synthesized and not found in nature, is capable of identifying extremely powerful new drugs and is poised to dramatically change early drug discovery. The researchers partnered with Ukraine-based Enamine Ltd. to begin incorporating the company’s vast virtual catalogue of drug-like compounds into their free public drug discovery database, called ZINC. The researchers are in the process of converting hundreds of millions of Enamine’s theoretical molecules into three-dimensional chemical models compatible with a computational pharmacology approach called “docking.” Docking makes it possible to rapidly simulate in three dimensions how hundreds of millions of potential drugs would bind to a specific biological target of interest.
Democratizing Data Science
January 15, 2019
Massachusetts Institute of Technology (MIT) researchers have developed a tool for nonstatisticians that automatically generates models for analyzing raw data. The tool takes in datasets and generates sophisticated statistical models normally used by experts to analyze, interpret, and predict underlying patterns in data. The tool currently resides on Jupyter Notebook, an open source Web framework that allows users to run programs interactively in browsers; users can write just a few lines of code to uncover insights into a range of topics. The system uses Bayesian modeling, a statistical method that continuously updates the probability of a variable as more information about the variable becomes available. The tool uses a modified version of “program synthesis,” a technique that automatically creates computer programs given data and a language to work within. Said MIT’s Feras Saad, “People have a lot of datasets that are sitting around, and our goal is to build systems that let people automatically get models they can use to ask questions about that data.”
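The "continuously updates the probability" idea behind Bayesian modeling can be shown with the simplest textbook case, a Beta prior on a success rate updated by observed successes and failures. This is a generic illustration, not the MIT tool's method, and the batch counts are invented.

```python
def update_beta(alpha, beta, successes, failures):
    # Conjugate update: a Beta(alpha, beta) prior plus binomial observations
    # gives a Beta(alpha + successes, beta + failures) posterior.
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    # Posterior mean estimate of the underlying success rate.
    return alpha / (alpha + beta)


# Start from a uniform prior, Beta(1, 1), then fold in data batch by batch.
alpha, beta = 1, 1
batches = [(3, 1), (4, 2)]  # hypothetical (successes, failures) per batch
for successes, failures in batches:
    alpha, beta = update_beta(alpha, beta, successes, failures)
    print(alpha, beta, beta_mean(alpha, beta))
```

Updating batch by batch lands on the same posterior as updating once with all the data, which is why the estimate can be refined as new information arrives.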
Nonprogrammers-data-science (MIT NEWS)
Disease Surveillance Tool Helps Detect Any Human Virus
February 4, 2019
Scientists at the Broad Institute of the Massachusetts Institute of Technology and Harvard University have developed a computational technique to expedite human pathogen monitoring through the design of molecular lures for any such virus and all its known strains, including those with low concentrations in clinical samples. The “CATCH” (Compact Aggregation of Targets for Comprehensive Hybridization) method can help small DNA-sequencing facilities conduct disease surveillance more efficiently, and help control outbreaks. CATCH lets users design tailored sets of probes to collect genetic material from any combination of microbial species. Users can easily input genomes from all known forms of all human viruses uploaded to the National Center for Biotechnology Information’s GenBank sequence database. Tests of CATCH-designed probe sets demonstrated that, following enrichment, viral content comprised 18 times more sequencing data than before, allowing the assembly of genomes that could not be cultivated from un-enriched samples.
Ultra-large virtual molecular libraries throw open chemical space
A library of 350 million drug-like molecules points to potential drugs.
Ultra-large virtual molecular libraries throw open chemical space (Nature.com)