
Deep learning system

Living in a dynamic physical world, it’s easy to forget how effortlessly we understand our surroundings. With minimal thought, we can figure out how scenes change and objects interact.

But what’s second nature for us is still a huge problem for machines. With the limitless number of ways that objects can move, teaching computers to predict future actions can be difficult.

Recently, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have moved a step closer, developing a deep-learning algorithm that, given a still image from a scene, can create a brief video that simulates the future of that scene.

Trained on 2 million unlabeled videos that include a year’s worth of footage, the algorithm generated videos that human subjects deemed to be realistic 20 percent more often than a baseline model.

The team says that future versions could be used for everything from improved security tactics to safer self-driving cars. According to CSAIL PhD student and first author Carl Vondrick, the algorithm can also help machines recognize people’s activities without expensive human annotations.

“These videos show us what computers think can happen in a scene,” says Vondrick. “If you can predict the future, you must have understood something about the present.”

Vondrick wrote the paper with MIT professor Antonio Torralba and Hamed Pirsiavash, a former CSAIL postdoc who is now a professor at the University of Maryland Baltimore County (UMBC). The work will be presented at next week’s Neural Information Processing Systems (NIPS) conference in Barcelona.

How it works

Multiple researchers have tackled similar topics in computer vision, including MIT Professor Bill Freeman, whose new work on “visual dynamics” also creates future frames in a scene. But where his model focuses on extrapolating videos into the future, Torralba’s model can also generate completely new videos that haven’t been seen before.
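
To make the recipe concrete, here is a minimal sketch of the general adversarial setup such systems use, written in PyTorch: a generator maps a still frame to a short clip, and a discriminator learns to tell generated clips from real ones. The layer sizes and shapes below are illustrative, and this is not the CSAIL team’s actual architecture.

```python
# Minimal sketch of adversarial video generation from a still image (illustrative
# only; not the CSAIL model). Tensors use (batch, channels, time, height, width).
import torch
import torch.nn as nn

class VideoGenerator(nn.Module):
    """Maps one 64x64 RGB frame to a short 16-frame clip."""
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(                # still image -> feature map
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decode = nn.Sequential(                # feature map -> video volume
            nn.ConvTranspose3d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, frame):
        h = self.encode(frame)                      # (B, 128, 16, 16)
        h = h.unsqueeze(2).repeat(1, 1, 4, 1, 1)    # give the features a time axis
        return self.decode(h)                       # (B, 3, 16, 64, 64)

class VideoDiscriminator(nn.Module):
    """Scores how realistic a 16-frame clip looks."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(128 * 4 * 16 * 16, 1),
        )

    def forward(self, clip):
        return self.net(clip)

# The generator is trained to fool the discriminator; the discriminator is trained
# to separate real clips from generated ones. One forward pass as a shape check:
G, D = VideoGenerator(), VideoDiscriminator()
still = torch.randn(2, 3, 64, 64)                   # stand-in for two real frames
clip = G(still)
print(clip.shape, D(clip).shape)                    # (2, 3, 16, 64, 64) and (2, 1)
```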

Supercomputing system

The new TX-Green computing system at the MIT Lincoln Laboratory Supercomputing Center (LLSC) has been named the most powerful supercomputer in New England, 43rd most powerful in the U.S., and 106th most powerful in the world. A team of experts at TOP500 ranks the world’s 500 most powerful supercomputers biannually. The systems are ranked based on the LINPACK benchmark, which is a measure of a system’s floating-point computing power, i.e., how fast a computer solves a dense system of linear equations.
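
As a concrete illustration of what that benchmark measures, the rough sketch below times a dense linear solve in NumPy and converts the standard operation count into a FLOP/s figure. It captures the idea on a single machine only and is not the official LINPACK/HPL code used for TOP500 rankings.

```python
# Rough, single-machine illustration of a LINPACK-style measurement (not the
# official HPL benchmark): time a dense linear solve, then convert the standard
# operation count for LU factorization into floating-point operations per second.
import time
import numpy as np

n = 2000                                  # matrix dimension
A = np.random.rand(n, n)
b = np.random.rand(n)

start = time.perf_counter()
x = np.linalg.solve(A, b)                 # LU factorization plus triangular solves
elapsed = time.perf_counter() - start

flops = (2.0 / 3.0) * n ** 3              # ~2/3 n^3 operations for the factorization
print(f"~{flops / elapsed / 1e9:.1f} GFLOP/s on this machine")
```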

Established in early 2016, the LLSC was developed to enhance computing power and accessibility for more than 1,000 researchers across the laboratory. The LLSC uses interactive supercomputing to augment the processing power of desktop systems to process large sets of sensor data, create high-fidelity simulations, and develop new algorithms. Located in Holyoke, Massachusetts, the new system is the only zero-carbon supercomputer on the TOP500 list; it uses energy from a mixture of hydroelectric, wind, solar, and nuclear sources.

In November, Dell EMC installed a new petaflop-scale system, which consists of 41,472 Intel processor cores and can compute 10^15 operations per second. Compared to LLSC’s previous technology, the new system provides 6 times more processing power and 20 times more bandwidth. This technology enables work in several laboratory research areas, such as space observation, robotic vehicles, communications, cybersecurity, machine learning, sensor processing, electronic devices, bioinformatics, and air traffic control.
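
A quick back-of-the-envelope check of what those figures imply per core, using only the numbers quoted above:

```python
# Per-core rate implied by the article's figures: 10^15 operations per second
# spread across 41,472 cores comes to roughly 24 billion operations per second each.
total_ops_per_second = 1e15
cores = 41_472
print(f"{total_ops_per_second / cores / 1e9:.1f} GFLOP/s per core")   # ~24.1
```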

The LLSC mission is to address supercomputing needs, develop new supercomputing capabilities and technologies, and collaborate with MIT campus supercomputing initiatives. “The LLSC vision is to enable the brilliant scientists and engineers at Lincoln Laboratory to analyze and process enormous amounts of information with complex algorithms,” says Jeremy Kepner, Lincoln Laboratory Fellow and head of the LLSC. “Our new system is one of the largest on the East Coast and is specifically focused on enabling new research in machine learning, advanced physical devices, and autonomous systems.”

Because the new processors are similar to the prototypes developed at the laboratory more than two decades ago, the new petaflop system is compatible with all existing LLSC software. “We have had many years to prepare our computing system for this kind of processor,” Kepner says. “This new system is essentially a plug-and-play solution.”

Outsized influence on organizational thinking

Jay W. Forrester SM ’45, professor emeritus in the MIT Sloan School of Management, founder of the field of system dynamics, and a pioneer of digital computing, died Nov. 16. He was 98.

Forrester’s time at MIT was rife with invention. He was a key figure in the development of digital computing, the national air defense system, and MIT’s Lincoln Laboratory. He developed servomechanisms (feedback-based controls for mechanical devices), radar controls, and flight-training computers for the U.S. Navy. He led Project Whirlwind, an early MIT digital computing project. It was his work on Whirlwind that led him to invent magnetic core memory, an early form of RAM for which he holds the patent, in 1949.

MIT Sloan Professor John Sterman, a student, friend, and colleague of Forrester’s since the 1970s, points to a 2003 photo of Forrester on a Segway as an illustration of his work’s lasting impact.

“He really is standing on top of the fruits of his many careers,” Sterman said. “He’s standing on a device that integrates servomechanisms, digital controllers, and a sophisticated feedback control system.”

“From the air traffic control system to 3-D printers, from the software companies use to manage their supply chains to the simulations nations use to understand climate change, the world in which we live today was made possible by Jay’s work,” he said.

System dynamics: A new view of management

It was after turning his attention to management in the mid-1950s that Forrester developed system dynamics — a model-based approach to analyzing complex organizations and systems — while studying a General Electric appliance factory. An MIT Technology Review article explores how he sought to combat the factory’s boom-and-bust cycle by examining its “weekly orders, inventory, production rate, and employees.” He then developed a computer simulation of the GE supply chain to show how management practices, not market forces, were causing the cycle.
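
The flavor of a system dynamics model fits in a few lines of code. The sketch below is a deliberately simplified stock-and-flow simulation with made-up parameters, not Forrester’s GE model: even with perfectly steady customer demand, an inventory-correction policy acting through a production delay is enough to produce the kind of boom-and-bust oscillation he diagnosed.

```python
# Simplified stock-and-flow simulation (illustrative parameters; not Forrester's
# GE model). A factory adjusts production toward a target inventory, but changes
# take effect only after a delay, so inventory overshoots and oscillates even
# though customer demand never changes.
weeks = 104
demand = 100.0                 # units shipped per week, constant
target_inventory = 400.0
inventory = 200.0              # start below target
production_rate = 100.0
adjustment_time = 4.0          # weeks management allows to close the inventory gap
production_delay = 6.0         # weeks for production changes to take effect

history = []
for week in range(weeks):
    # Management policy: set desired production from the current inventory gap.
    desired = demand + (target_inventory - inventory) / adjustment_time
    # Physical constraint: actual production only drifts toward the desired level.
    production_rate += (desired - production_rate) / production_delay
    inventory += production_rate - demand
    history.append(inventory)

print(f"inventory swings between {min(history):.0f} and {max(history):.0f} units")
```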

Forrester’s “Industrial Dynamics” was published in 1961. The field expanded to chart the complexities of economies, supply chains, and organizations. Later, he applied the principles of system dynamics to broader social issues in “Urban Dynamics,” published in 1969, and “World Dynamics,” published in 1971. The latter was an integrated simulation model of population, resources, and economic growth. Forrester became a critic of growth, a position that earned him few friends.

Turning plain text into data for statistical analysis

Of the vast wealth of information unlocked by the Internet, most is plain text. The data necessary to answer myriad questions — about, say, the correlations between the industrial use of certain chemicals and incidents of disease, or between patterns of news coverage and voter-poll results — may all be online. But extracting it from plain text and organizing it for quantitative analysis may be prohibitively time consuming.

Information extraction — or automatically classifying data items stored as plain text — is thus a major topic of artificial-intelligence research. Last week, at the Association for Computational Linguistics’ Conference on Empirical Methods in Natural Language Processing, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory won a best-paper award for a new approach to information extraction that turns conventional machine learning on its head.

Most machine-learning systems work by combing through training examples and looking for patterns that correspond to classifications provided by human annotators. For instance, humans might label parts of speech in a set of texts, and the machine-learning system will try to identify patterns that resolve ambiguities — for instance, when “her” is a direct object and when it’s an adjective.

Typically, computer scientists will try to feed their machine-learning systems as much training data as possible. That generally increases the chances that a system will be able to handle difficult problems.

In their new paper, by contrast, the MIT researchers train their system on scanty data — because in the scenario they’re investigating, that’s usually all that’s available. When that limited information leaves an extraction uncertain, the system compensates by seeking out additional documents on the same topic.

“In information extraction, traditionally, in natural-language processing, you are given an article and you need to do whatever it takes to extract correctly from this article,” says Regina Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science and senior author on the new paper. “That’s very different from what you or I would do. When you’re reading an article that you can’t understand, you’re going to go on the web and find one that you can understand.”
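
In code, the “find an article you can understand” strategy might look like the sketch below: extract from the original article first, and consult other documents only when the classifier’s confidence is low. The helper functions, the classifier interface, and the threshold here are hypothetical placeholders, not components of the MIT system.

```python
# Hedged sketch of confidence-based backoff for information extraction. The
# classifier, vectorize, and search_for_similar_articles arguments are assumed
# placeholders (a scikit-learn-style classifier with predict_proba, a feature
# extractor, and a document-retrieval helper); none of this is the MIT system.
def extract_value(article_text, classifier, vectorize, search_for_similar_articles,
                  confidence_threshold=0.8):
    """Extract a field from an article; consult other articles only if unsure."""
    def classify(text):
        probs = classifier.predict_proba(vectorize(text))[0]
        return probs.argmax(), probs.max()          # predicted value, confidence

    value, confidence = classify(article_text)
    if confidence >= confidence_threshold:
        return value, confidence                    # the original article sufficed

    # Low confidence: pull up other articles on the same topic and keep the most
    # confident extraction found among them.
    for other in search_for_similar_articles(article_text):
        other_value, other_confidence = classify(other)
        if other_confidence > confidence:
            value, confidence = other_value, other_confidence
    return value, confidence
```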

Venture capitalists gather to discuss machine intelligence

Surviving breast cancer changed the course of Regina Barzilay’s research. The experience showed her, in stark relief, that oncologists and their patients lack tools for data-driven decision making. That includes what treatments to recommend, but also whether a patient’s sample even warrants a cancer diagnosis, she explained at the Nov. 10 Machine Intelligence Summit, organized by MIT and venture capital firm Pillar.

“We do more machine learning when we decide on Amazon which lipstick you would buy,” said Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science at MIT. “But not if you were deciding whether you should get treated for cancer.”

Barzilay now studies how smarter computing can help patients. She wields the powerful predictive approach called machine learning, a technique that allows computers, given enough data and training, to pick out patterns on their own — sometimes even beyond what humans are capable of pinpointing.

Machine learning has long been vaunted in consumer contexts — Apple’s Siri can talk with us because machine learning enables her to understand natural human speech — yet the summit gave a glimpse of the approach’s much broader potential. Its reach could offer not only better Siris (e.g., Amazon’s “Alexa”), but improved health care and government policies.

Machine intelligence is “absolutely going to revolutionize our lives,” said Pillar co-founder Jamie Goldstein ’89. Goldstein and Anantha Chandrakasan, head of the MIT Department of Electrical Engineering and Computer Science (EECS) and the Vannevar Bush Professor of Electrical Engineering and Computer Science, organized the conference to bring together industry leaders, venture capitalists, students, and faculty from the Computer Science and Artificial Intelligence Laboratory (CSAIL), the Institute for Data, Systems, and Society (IDSS), and the Laboratory for Information and Decision Systems (LIDS) to discuss real-world problems and machine learning solutions.

Barzilay is already thinking along those lines. Her group’s work aims to help doctors and patients make more informed medical decisions with machine learning. She has a vision for the future patient in the oncologist’s office: “If you’re taking this treatment, [you’ll see] how your chances are going to be changed.”

Machine senses

Machine learning has already proven powerful. But Antonio Torralba, professor of electrical engineering and computer science, believes that machines can learn faster, and thereby do more. His team’s approach mimics the way humans learn in infancy. “We just start playing with things and seeing how they feel,” Torralba said. To illustrate, he showed the room a video of a baby turning over squeaky bubble wrap in her hands. Importantly, we notice the noises things make when we move them around, he said.

More powerful than previously realized

Quantum computers promise huge speedups on some computational problems because they harness a strange physical property called entanglement, in which the physical state of one tiny particle depends on measurements made of another. In quantum computers, entanglement is a computational resource, roughly like a chip’s clock cycles — kilohertz, megahertz, gigahertz — and memory in a conventional computer.

In a recent paper in the journal Proceedings of the National Academy of Sciences, researchers at MIT and IBM’s Thomas J. Watson Research Center show that simple systems of quantum particles exhibit exponentially more entanglement than was previously believed. That means that quantum computers — or other quantum information devices — powerful enough to be of practical use could be closer than we thought.

Where ordinary computers deal in bits of information, quantum computers deal in quantum bits, or qubits. Previously, researchers believed that in a certain class of simple quantum systems, the degree of entanglement was, at best, proportional to the logarithm of the number of qubits.

“For models that satisfy certain physical-reasonability criteria — i.e., they’re not too contrived; they’re something that you could in principle realize in the lab — people thought that a factor of the log of the system size was the best you can do,” says Ramis Movassagh, a researcher at Watson and one of the paper’s two co-authors. “What we proved is that the entanglement scales as the square root of the system size. Which is really exponentially more.”

That means that a 10,000-qubit quantum computer could exhibit about 10 times as much entanglement as previously thought. And that difference increases exponentially as more qubits are added.
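
The “about 10 times” figure follows directly from comparing the two scalings quoted above for a system of 10,000 qubits:

```python
# Compare the previously assumed log(n) entanglement scaling with the newly
# proven sqrt(n) scaling for n = 10,000 qubits.
import math

n = 10_000
old_scaling = math.log(n)      # about 9.2
new_scaling = math.sqrt(n)     # 100.0
print(f"sqrt(n) / log(n) = {new_scaling / old_scaling:.1f}")   # about 10.9
```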

Reproduces aspects of human neurology

MIT researchers and their colleagues have developed a new computational model of the human brain’s face-recognition mechanism that seems to capture aspects of human neurology that previous models have missed.

The researchers designed a machine-learning system that implemented their model, and they trained it to recognize particular faces by feeding it a battery of sample images. They found that the trained system included an intermediate processing step that represented a face’s degree of rotation — say, 45 degrees from center — but not the direction — left or right.

This property wasn’t built into the system; it emerged spontaneously from the training process. But it duplicates an experimentally observed feature of the primate face-processing mechanism. The researchers consider this an indication that their system and the brain are doing something similar.

“This is not a proof that we understand what’s going on,” says Tomaso Poggio, a professor of brain and cognitive sciences at MIT and director of the Center for Brains, Minds, and Machines (CBMM), a multi-institution research consortium funded by the National Science Foundation and headquartered at MIT. “Models are kind of cartoons of reality, especially in biology. So I would be surprised if things turn out to be this simple. But I think it’s strong evidence that we are on the right track.”

Indeed, the researchers’ new paper includes a mathematical proof that the particular type of machine-learning system they use, which was intended to offer what Poggio calls a “biologically plausible” model of the nervous system, will inevitably yield intermediary representations that are indifferent to angle of rotation.
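
One way to see how a representation can encode how far a face is rotated without encoding the direction is to pool a filter’s response over an image and its mirror reflection; the left-turned and right-turned versions of a face then become indistinguishable. The toy example below illustrates only that general principle, with random stand-in data, and is not the researchers’ model.

```python
# Toy illustration (not the paper's model): pooling a filter response over an
# image and its left-right mirror gives a value that cannot distinguish a face
# rotated to the left from the same face rotated to the right.
import numpy as np

rng = np.random.default_rng(0)
template = rng.standard_normal((32, 32))       # stand-in for a learned filter

def response(image):
    """Raw filter response: sensitive to left/right differences."""
    return float(np.sum(image * template))

def mirror_pooled_response(image):
    """Pool the response over the image and its mirror reflection."""
    return response(image) + response(np.fliplr(image))

face_turned_left = rng.standard_normal((32, 32))   # stand-in for a rotated face
face_turned_right = np.fliplr(face_turned_left)    # the mirrored view

print(response(face_turned_left) == response(face_turned_right))   # False in general
print(np.isclose(mirror_pooled_response(face_turned_left),
                 mirror_pooled_response(face_turned_right)))        # always True
```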

Poggio, who is also a principal investigator at MIT’s McGovern Institute for Brain Research, is the senior author on a paper describing the new work, which appeared today in the journal Computational Biology. He’s joined on the paper by several other members of both the CBMM and the McGovern Institute: first author Joel Leibo, a researcher at Google DeepMind, who earned his PhD in brain and cognitive sciences from MIT with Poggio as his advisor; Qianli Liao, an MIT graduate student in electrical engineering and computer science; Fabio Anselmi, a postdoc in the IIT@MIT Laboratory for Computational and Statistical Learning, a joint venture of MIT and the Italian Institute of Technology; and Winrich Freiwald, an associate professor at the Rockefeller University.

Fabricate drones with a wide range of shapes and structures

This fall’s new Federal Aviation Administration regulations have made drone flight easier than ever for both companies and consumers. But what if the drones out on the market aren’t exactly what you want?

A new system from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) is the first to allow users to design, simulate, and build their own custom drone. Users can change the size, shape, and structure of their drone based on the specific needs they have for payload, cost, flight time, battery usage, and other factors.

To demonstrate, researchers created a range of unusual-looking drones, including a five-rotor “pentacopter” and a rabbit-shaped “bunnycopter” with propellers of different sizes and rotors of different heights.

“This system opens up new possibilities for how drones look and function,” says MIT Professor Wojciech Matusik, who oversaw the project in CSAIL’s Computational Fabrication Group. “It’s no longer a one-size-fits-all approach for people who want to make and use drones for particular purposes.”

The interface lets users design drones with different propellers, rotors, and rods. It also provides guarantees that the drones it fabricates can take off, hover and land — which is no simple task considering the intricate technical trade-offs associated with drone weight, shape, and control.

“For example, adding more rotors generally lets you carry more weight, but you also need to think about how to balance the drone to make sure it doesn’t tip,” says PhD student Tao Du, who was first author on a related paper about the system. “Irregularly-shaped drones are very difficult to stabilize, which means that they require establishing very complex control parameters.”
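
Guaranteeing that a design can take off and hover means checking, at minimum, that the rotors can lift the craft’s weight and that their thrusts can be balanced around the center of mass. The sketch below performs that simplified check; it illustrates the kind of constraint involved rather than the CSAIL system itself, and all numbers are made up.

```python
# Simplified hover-feasibility check (illustrative only; not the CSAIL system).
# Two conditions: total available thrust exceeds the weight, and the per-rotor
# thrusts can be balanced so they produce no net pitch or roll torque.
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def hover_feasible(rotor_positions_m, max_thrust_per_rotor_n, mass_kg):
    """rotor_positions_m: (n, 2) x,y offsets of each rotor from the center of mass."""
    positions = np.asarray(rotor_positions_m, dtype=float)
    n = len(positions)
    weight = mass_kg * G
    if n * max_thrust_per_rotor_n <= weight:
        return False                                   # cannot lift its own weight
    # Solve for thrusts t with sum(t) = weight and zero net torque, then check
    # that every thrust is achievable (between 0 and the rotor's maximum).
    A = np.vstack([np.ones(n), positions.T])           # 3 equations, n unknowns
    b = np.array([weight, 0.0, 0.0])
    thrusts, *_ = np.linalg.lstsq(A, b, rcond=None)    # minimum-norm solution
    return bool(np.allclose(A @ thrusts, b)
                and np.all(thrusts >= 0)
                and np.all(thrusts <= max_thrust_per_rotor_n))

# A symmetric five-rotor layout, roughly in the spirit of a "pentacopter":
penta = [(np.cos(a), np.sin(a)) for a in np.linspace(0, 2 * np.pi, 5, endpoint=False)]
print(hover_feasible(penta, max_thrust_per_rotor_n=8.0, mass_kg=1.5))   # True here
```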

Du and Matusik co-authored a paper with PhD student Adriana Schulz, postdoc Bo Zhu, and Assistant Professor Bernd Bickel of IST Austria. It will be presented next week at the annual SIGGRAPH Asia conference in Macao, China.

Today’s commercial drones only come in a small range of options, typically with an even number of rotors and upward-facing propellers. But there are many emerging use cases for other kinds of drones. For example, having an odd number of rotors might create a clearer view for a drone’s camera, or allow the drone to carry objects with unusual shapes.

Designing these less conventional drones, however, often requires expertise in multiple disciplines, including control systems, fabrication, and electronics.

Machine learning system

In recent years, computers have gotten remarkably good at recognizing speech and images: Think of the dictation software on most cellphones, or the algorithms that automatically identify people in photos posted to Facebook.

But recognition of natural sounds — such as crowds cheering or waves crashing — has lagged behind. That’s because most automated recognition systems, whether they process audio or visual information, are the result of machine learning, in which computers search for patterns in huge compendia of training data. Usually, the training data has to be first annotated by hand, which is prohibitively expensive for all but the highest-demand applications.

Sound recognition may be catching up, however, thanks to researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). At the Neural Information Processing Systems conference next week, they will present a sound-recognition system that outperforms its predecessors but doesn’t require hand-annotated data during training.

Instead, the researchers trained the system on video. First, existing computer vision systems that recognize scenes and objects categorized the images in the video. The new system then found correlations between those visual categories and natural sounds.

“Computer vision has gotten so good that we can transfer it to other domains,” says Carl Vondrick, an MIT graduate student in electrical engineering and computer science and one of the paper’s two first authors. “We’re capitalizing on the natural synchronization between vision and sound. We scale up with tons of unlabeled video to learn to understand sound.”
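
Schematically, the training recipe looks like the sketch below: a pretrained vision network supplies soft scene and object labels for video frames, and an audio network is trained to predict those labels from the accompanying sound, so no human annotation is needed. The architecture, class count, and sizes here are placeholders, not the researchers’ actual network.

```python
# Schematic of the cross-modal training idea (placeholders only; not the CSAIL
# architecture): an audio network learns to match the label distribution that a
# pretrained vision network assigns to the corresponding video frame.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 401          # stand-in for the vision model's scene/object categories

class SoundNetLike(nn.Module):
    """Small 1-D convolutional network over a raw waveform."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=32, stride=8), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(32, NUM_CLASSES)

    def forward(self, waveform):                    # (batch, 1, samples)
        features = self.conv(waveform).squeeze(-1)  # (batch, 32)
        return self.head(features)                  # unnormalized class scores

audio_net = SoundNetLike()
optimizer = torch.optim.Adam(audio_net.parameters(), lr=1e-4)

# One training step on a (waveform, frame) pair. vision_probs stands in for the
# soft labels a pretrained image classifier would assign to the video frame.
waveform = torch.randn(8, 1, 44100)                 # one second of audio, batch of 8
vision_probs = torch.softmax(torch.randn(8, NUM_CLASSES), dim=1)

log_probs = F.log_softmax(audio_net(waveform), dim=1)
loss = F.kl_div(log_probs, vision_probs, reduction="batchmean")
loss.backward()
optimizer.step()
print(float(loss))
```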

Fully automated speech recognition

Speech recognition systems, such as those that convert speech to text on cellphones, are generally the result of machine learning. A computer pores through thousands or even millions of audio files and their transcriptions, and learns which acoustic features correspond to which typed words.

But transcribing recordings is costly, time-consuming work, which has limited speech recognition to a small subset of languages spoken in wealthy nations.

At the Neural Information Processing Systems conference this week, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are presenting a new approach to training speech-recognition systems that doesn’t depend on transcription. Instead, their system analyzes correspondences between images and spoken descriptions of those images, as captured in a large collection of audio recordings. The system then learns which acoustic features of the recordings correlate with which image characteristics.
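
A common way to learn such correspondences is to embed images and spoken captions in a shared space and train both encoders so that matching pairs score higher than mismatched ones. The sketch below illustrates that general recipe with placeholder encoders and input sizes; it is not the CSAIL model.

```python
# Sketch of learning image / spoken-caption correspondences with a margin-based
# ranking loss (placeholder encoders and sizes; not the CSAIL model). Matching
# image-audio pairs share an index in the batch and should outscore mismatches.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMBED_DIM = 256
image_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, EMBED_DIM))
audio_encoder = nn.Sequential(nn.Flatten(), nn.Linear(40 * 400, EMBED_DIM))

def ranking_loss(image_vecs, audio_vecs, margin=1.0):
    """Push each matched pair's score above every mismatched pair's by a margin."""
    scores = image_vecs @ audio_vecs.T               # (batch, batch) similarities
    positives = scores.diag().unsqueeze(1)           # matched image/audio scores
    # Hinge in both directions: wrong captions for an image, wrong images for a caption.
    cost = F.relu(margin + scores - positives) + F.relu(margin + scores.t() - positives)
    cost = cost - torch.diag(cost.diag())            # do not penalize the matched pairs
    return cost.mean()

images = torch.randn(16, 3, 64, 64)                  # stand-in image batch
spoken_captions = torch.randn(16, 40, 400)           # stand-in audio spectrograms
loss = ranking_loss(image_encoder(images), audio_encoder(spoken_captions))
loss.backward()
print(float(loss))
```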

“The goal of this work is to try to get the machine to learn language more like the way humans do,” says Jim Glass, a senior research scientist at CSAIL and a co-author on the paper describing the new system. “The current methods that people use to train up speech recognizers are very supervised. You get an utterance, and you’re told what’s said. And you do this for a large body of data.

“Big advances have been made — Siri, Google — but it’s expensive to get those annotations, and people have thus focused on, really, the major languages of the world. There are 7,000 languages, and I think less than 2 percent have ASR [automatic speech recognition] capability, and probably nothing is going to be done to address the others. So if you’re trying to think about how technology can be beneficial for society at large, it’s interesting to think about what we need to do to change the current situation. And the approach we’ve been taking through the years is looking at what we can learn with less supervision.”