
Announcing the Google MOOC Focused Research Awards



Last year, Google and Tsinghua University hosted the 2014 APAC MOOC Focused Faculty Workshop, an event designed to share, brainstorm and generate ideas aimed at fostering MOOC innovation. As a result of the ideas generated at the workshop, we solicited proposals from the attendees for research collaborations that would advance important topics in MOOC development.

After expert reviews and committee discussions, we are pleased to announce the following recipients of the MOOC Focused Research Awards. These awards cover research exploring new interactions to enhance the learning experience, personalized learning, online community building, interoperability of online learning platforms, and education accessibility:

  • “MOOC Visual Analytics” - Michael Ginda, Indiana University, United States
  • “Improvement of students’ interaction in MOOCs using participative networks” - Pedro A. Pernías Peco, Universidad de Alicante, Spain
  • “Automated Analysis of MOOC Discussion Content to Support Personalised Learning” - Katrina Falkner, The University of Adelaide, Australia
  • “Extending the Offline Capability of Spoken Tutorial Methodology” - Kannan Moudgalya, Indian Institute of Technology Bombay, India
  • “Launching the Pan Pacific ISTP (Information Science and Technology Program) through MOOCs” - Yasushi Kodama, Hosei University, Japan
  • “Fostering Engagement and Social Learning with Incentive Schemes and Gamification Elements in MOOCs” - Thomas Schildhauer, Alexander von Humboldt Institute for Internet and Society, Germany
  • “Reusability Measurement and Social Community Analysis from MOOC Content Users” - Timothy K. Shih, National Central University, Taiwan

In order to further support these projects and foster collaboration, we have begun pairing the award recipients with Googlers pursuing online education research as well as product development teams.

Google is committed to supporting innovation in online learning at scale, and we congratulate the recipients of the MOOC Focused Research Awards. It is our belief that these collaborations will further develop the potential of online education, and we are very pleased to work with these researchers to jointly push the frontier of MOOCs.

NIPS 2015 and Machine Learning Research at Google



This week, Montreal hosts the 29th Annual Conference on Neural Information Processing Systems (NIPS 2015), a machine learning and computational neuroscience conference that includes invited talks, demonstrations and oral and poster presentations of some of the latest in machine learning research. Google will have a strong presence at NIPS 2015, with over 140 Googlers attending in order to contribute to and learn from the broader academic research community by presenting technical talks and posters, in addition to hosting workshops and tutorials.

Research at Google is at the forefront of innovation in Machine Intelligence, actively exploring virtually all aspects of machine learning, including classical algorithms as well as cutting-edge techniques such as deep learning. Focusing on both theory and application, much of our work on language understanding, speech, translation, visual processing, ranking, and prediction relies on Machine Intelligence. In all of those tasks and many others, we gather large volumes of direct or indirect evidence of relationships of interest, and develop learning approaches to understand and generalize.

If you are attending NIPS 2015, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people. You can also learn more about our research being presented at NIPS 2015 in the list below (Googlers highlighted in blue).

Google is a Platinum Sponsor of NIPS 2015.

PROGRAM ORGANIZERS
General Chairs
Corinna Cortes, Neil D. Lawrence
Program Committee includes:
Samy Bengio, Gal Chechik, Ian Goodfellow, Shakir Mohamed, Ilya Sutskever

ORAL SESSIONS
Learning Theory and Algorithms for Forecasting Non-stationary Time Series
Vitaly Kuznetsov, Mehryar Mohri

SPOTLIGHT SESSIONS
Distributed Submodular Cover: Succinctly Summarizing Massive Data
Baharan Mirzasoleiman, Amin Karbasi, Ashwinkumar Badanidiyuru, Andreas Krause

Spatial Transformer Networks
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu

Pointer Networks
Oriol Vinyals, Meire Fortunato, Navdeep Jaitly

Structured Transforms for Small-Footprint Deep Learning
Vikas Sindhwani, Tara Sainath, Sanjiv Kumar

Spherical Random Features for Polynomial Kernels
Jeffrey Pennington, Felix Yu, Sanjiv Kumar

POSTERS
Learning to Transduce with Unbounded Memory
Edward Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, Phil Blunsom

Deep Knowledge Tracing
Chris Piech, Jonathan Bassen, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas Guibas, Jascha Sohl-Dickstein

Hidden Technical Debt in Machine Learning Systems
D Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, Dan Dennison

Grammar as a Foreign Language
Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton

Stochastic Variational Information Maximisation
Shakir Mohamed, Danilo Rezende

Embedding Inference for Structured Multilabel Prediction
Farzaneh Mirzazadeh, Siamak Ravanbakhsh, Bing Xu, Nan Ding, Dale Schuurmans

On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators
Changyou Chen, Nan Ding, Lawrence Carin

Spectral Norm Regularization of Orthonormal Representations for Graph Transduction
Rakesh Shivanna, Bibaswan Chatterjee, Raman Sankaran, Chiranjib Bhattacharyya, Francis Bach

Differentially Private Learning of Structured Discrete Distributions
Ilias Diakonikolas, Moritz Hardt, Ludwig Schmidt

Nearly Optimal Private LASSO
Kunal Talwar, Li Zhang, Abhradeep Thakurta

Learning Continuous Control Policies by Stochastic Value Gradients
Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Tom Erez, Yuval Tassa

Gradient Estimation Using Stochastic Computation Graphs
John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer

Teaching Machines to Read and Comprehend
Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, Phil Blunsom

Bayesian dark knowledge
Anoop Korattikara, Vivek Rathod, Kevin Murphy, Max Welling

Generalization in Adaptive Data Analysis and Holdout Reuse
Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, Aaron Roth

Semi-supervised Sequence Learning
Andrew Dai, Quoc Le

Natural Neural Networks
Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, Koray Kavukcuoglu

Revenue Optimization against Strategic Buyers
Andres Munoz Medina, Mehryar Mohri


WORKSHOPS
Feature Extraction: Modern Questions and Challenges
Workshop Chairs include: Dmitry Storcheus, Afshin Rostamizadeh, Sanjiv Kumar
Program Committee includes: Jeffrey Pennington, Vikas Sindhwani

NIPS Time Series Workshop
Invited Speakers include: Mehryar Mohri
Panelists include: Corinna Cortes

Nonparametric Methods for Large Scale Representation Learning
Invited Speakers include: Amr Ahmed

Machine Learning for Spoken Language Understanding and Interaction
Invited Speakers include: Larry Heck

Adaptive Data Analysis
Organizers include: Moritz Hardt

Deep Reinforcement Learning
Organizers include: David Silver
Invited Speakers include: Sergey Levine

Advances in Approximate Bayesian Inference
Organizers include: Shakir Mohamed
Panelists include: Danilo Rezende

Cognitive Computation: Integrating Neural and Symbolic Approaches
Invited Speakers include: Ramanathan V. Guha, Geoffrey Hinton, Greg Wayne

Transfer and Multi-Task Learning: Trends and New Perspectives
Invited Speakers include: Mehryar Mohri
Poster presentations include: Andres Munoz Medina

Learning and privacy with incomplete data and weak supervision
Organizers include: Felix Yu
Program Committee includes: Alexander Blocker, Krzysztof Choromanski, Sanjiv Kumar
Speakers include: Nando de Freitas

Black Box Learning and Inference
Organizers include: Ali Eslami
Keynotes include: Geoff Hinton

Quantum Machine Learning
Invited Speakers include: Hartmut Neven

Bayesian Nonparametrics: The Next Generation
Invited Speakers include: Amr Ahmed

Bayesian Optimization: Scalability and Flexibility
Organizers include: Nando de Freitas

Reasoning, Attention, Memory (RAM)
Invited speakers include: Alex Graves, Ilya Sutskever

Extreme Classification 2015: Multi-class and Multi-label Learning in Extremely Large Label Spaces
Panelists include: Mehryar Mohri, Samy Bengio
Invited speakers include: Samy Bengio

Machine Learning Systems
Invited speakers include: Jeff Dean


SYMPOSIA
Brains, Mind and Machines
Invited Speakers include: Geoffrey Hinton, Demis Hassabis

Deep Learning Symposium
Program Committee Members include: Samy Bengio, Phil Blunsom, Nando De Freitas, Ilya Sutskever, Andrew Zisserman
Invited Speakers include: Max Jaderberg, Sergey Ioffe, Alexander Graves

Algorithms Among Us: The Societal Impacts of Machine Learning
Panelists include: Shane Legg


TUTORIALS
NIPS 2015 Deep Learning Tutorial
Geoffrey E. Hinton, Yoshua Bengio, Yann LeCun

Large-Scale Distributed Systems for Training Neural Networks
Jeff Dean, Oriol Vinyals

Largest collection of Google Logos on the web: Set 10

[Image gallery: Google logos 403–424]


A Comparison of Five Google Online Courses



Google has taught five open online courses in the past year, reaching nearly 400,000 interested students. In this post I will share observations from experiments with a year’s worth of these courses. We were particularly surprised by how the size of our courses evolved during the year; how students responded to a non-linear, problem-based MOOC; and the value that many students got out of the courses, even after the courses ended.

Observation #1: Course size
We have seen varying numbers of registered students in the courses. Our first two courses (Power Searching versions one and two) garnered significant interest, with over 100,000 students registering for each course. Our more recent courses have attracted closer to 40,000 students each. It’s likely that this is a result of initial interest in MOOCs starting to decline, as well as students realizing that online courses require a significant commitment of time and effort. We’d like other MOOC content aggregators to share their results so that we can identify overall MOOC patterns.

[Course comparison table omitted. *Satisfaction scores marked with an asterisk are based on surveys sent only to course completers; other satisfaction scores represent aggregate survey results sent to all registrants.]

Observation #2: Completion rates
Comparing these five two-week courses, we notice that most of them show a completion rate (measured by the number of students who meet the course criteria for completion divided by the total number of registrants) of between 11% and 16%. Advanced Power Searching was an outlier at only 4%. Why? A possible answer can be found by comparing the culminating projects for each course: Power Searching students completed a multiple-choice test, while Advanced Power Searching students completed case studies applying their skills to research problems and, after grading their own work, also had to solve a final search challenge.
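To make the completion-rate arithmetic concrete, here is a minimal sketch of the calculation; the registrant and completer counts below are hypothetical, chosen only to be consistent with the percentages reported above.

```python
def completion_rate(completers: int, registrants: int) -> float:
    """Completion rate as defined above: completers / registrants."""
    return completers / registrants

# Hypothetical counts, consistent with the ranges reported above.
print(f"{completion_rate(14_000, 100_000):.0%}")  # 14%: within the typical 11-16% band
print(f"{completion_rate(1_600, 40_000):.0%}")    # 4%: the Advanced Power Searching outlier
```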

Advanced Power Searching also differed from all of the other courses in the way it presented content and activities. Power Searching offered videos and activities in a highly structured, linear path; Advanced Power Searching presented students with a selection of challenges followed by supporting lessons. We observed a decreasing number of views on each challenge page similar to the pattern in the linear course (see figure 1).
Figure 1. Unique page views for Power Searching and Advanced Power Searching

Students who did complete Advanced Power Searching expressed satisfaction with the course (95% of course completing students would recommend the course to others, compared with 94% of survey respondents from Power Searching). We surmise that the lower completion rate for Advanced Power Searching compared to Power Searching could be a result of the relative difficulty of this course (it assumed significantly more foundational knowledge than Power Searching), the unstructured nature of the course, or a combination of these and other factors.

Even though completion rates seem low when compared with traditional courses, we are excited about the sheer number of students we’ve reached through our courses (over 51,000 earning certificates of completion). If we offered the same content to classrooms of 30 students, it would take over four and a half years of daily classes to teach the same information!

Observation #3: Students have varied goals
We would also like to move the discussion beyond completion rates. We’ve noticed that students register for online courses for many different reasons. In Mapping with Google, we asked students to select a goal during registration. We discovered that:
  • 52% of registrants intended to complete the course
  • 48% merely wanted to learn a few new things about Google’s mapping tools
Post-course surveys revealed that:
  • 78% of students achieved the goal they defined at registration
  • 89% of students learned new features of Google Maps
  • 76% reported learning new features of Google Earth
Though a much smaller percentage of students completed course requirements, these statistics show that many of the students attained their learning goals.

Observation #4: Continued interest in post-course access
After each course ended, we kept many of the course materials (videos, activities) available. Though we removed access to the forums, final projects/assessments, and teaching assistants, we have seen significant interest in the content as measured by Google and YouTube Analytics. The Power Searching course pages have generated nearly three million page views after the courses finished; viewers have watched over 160,000 hours (18 years!) of course videos. In the two months since Mapping with Google finished, we have seen over 70,000 unique visitors to the course pages.

In all of our courses, we saw a high number of students interested in learning online: 96% of Power Searching participants agreed or strongly agreed that they would take a course in a similar format. We have succeeded in teaching tens of thousands of students to be more savvy users of Google tools. Future posts will take an in-depth look at our experiments with self-graded assessments, community elements that enhance learning, and design elements that influence student success.

Google Computational Journalism Research Awards launch in Europe



Journalism is evolving fast in the digital age, and researchers across Europe are working on exciting projects to create innovative new tools and open source software that will support online journalism and benefit readers. As part of the wider Google Digital News Initiative (DNI), we invited academic researchers across Europe to submit proposals for the Computational Journalism Research Awards.

After careful review by Google’s News Lab and Research teams, the following projects were selected:

SCAN: Systematic Content Analysis of User Comments for Journalists
Walid Maalej, Professor of Informatics, University of Hamburg
Wiebke Loosen, Senior Researcher for Journalism, Hans-Bredow-Institute, Hamburg, Germany
This project aims to develop a framework for the systematic, semi-automated analysis of audience feedback on journalistic content, in order to better reflect the voice of users, reduce the analysis effort, and help journalists generate new content from the user comments.

Event Thread Extraction for Viewpoint Analysis
Ioana Manolescu, Senior Researcher, INRIA Saclay, France
Xavier Tannier, Professor of Computer Science, University Paris-Sud, France
The goal of the project is to automatically build topic-based "event threads" that will help journalists and citizens decode claims made by public figures, in order to distinguish between personal opinion, communication tools and deliberate distortions of reality.

Computational Support for Creative Story Development by Journalists
Neil Maiden, Professor of Systems Engineering
George Brock, Professor of Journalism, City University London, UK
This project will develop a new software prototype to implement creative search strategies that journalists could use to strengthen investigative storytelling more efficiently than with current news content management and search tools.

We congratulate the recipients of these awards and we look forward to the results of their research. Each award includes funding of up to $60,000 in cash and $20,000 in computing credits on Google’s Cloud Platform. Stay tuned for updates on their progress.

Google Science Fair 2015: what will you try?



(Cross-posted from the Google for Education Blog)

Science is about observing and experimenting. It’s about exploring unanswered questions, solving problems through curiosity, learning as you go and always trying again.

That’s the spirit behind the fifth annual Google Science Fair, kicking off today. Together with LEGO Education, National Geographic, Scientific American and Virgin Galactic, we’re calling on all young researchers, explorers, builders, technologists and inventors to try something ambitious. Something imaginative, or maybe even unimaginable. Something that might just change the world around us.

From now through May 18, students around the world ages 13-18 can submit projects online across all scientific fields, from biology to computer science to anthropology and everything in between. Prizes include $100,000 in scholarships and classroom grants from Scientific American and Google, a National Geographic Expedition to the Galapagos, an opportunity to visit LEGO designers at their Denmark headquarters, and the chance to tour Virgin Galactic’s new spaceship at their Mojave Air and Spaceport. This year we’re also introducing an award to recognize an Inspiring Educator, as well as a Community Impact Award honoring a project that addresses an environmental or health challenge.

It’s only through trying something that we can get somewhere. Flashlights required batteries, then Ann Makosinski tried the heat of her hand. His grandfather would wander out of bed at night, until Kenneth Shinozuka tried a wearable sensor. The power supply was constantly unstable in her Indian village, so Harine Ravichandran tried to build a different kind of regulator. Previous Science Fair winners have blown us away with their ideas. Now it’s your turn.

Big ideas that have the potential to make a big impact often start from something small. Something that makes you curious. Something you love, you’re good at, and want to try.

So, what will you try?

Time travel with Google Street View

I always knew Google would do this; it was so obvious. Google Street View now has a new feature that lets you go back in time to see how any location looked, right back to when Google's cameras first captured the view. Now when you start Street View you'll see a time stamp on the pop-up Street View window and a timeline with a slider. You can select any of the points on the slider.
Your street, like mine, probably hasn't changed much. Time magazine has put together a series of time-lapse sequences that show iconic buildings like NYC's One World Trade Center rising out of the ground over the years. This feature will be fascinating to explore in 20 to 30 years' time.

from The Universal Machine http://universal-machine.blogspot.com/



Google voice search: faster and more accurate



Back in 2012, we announced that Google voice search had taken a new turn by adopting Deep Neural Networks (DNNs) as the core technology used to model the sounds of a language. These replaced the 30-year-old standard in the industry: the Gaussian Mixture Model (GMM). DNNs were better able to assess which sound a user is producing at every instant in time, and with this they delivered greatly increased speech recognition accuracy.

Today, we’re happy to announce we built even better neural network acoustic models using Connectionist Temporal Classification (CTC) and sequence discriminative training techniques. These models are a special extension of recurrent neural networks (RNNs) that are more accurate, especially in noisy environments, and they are blazingly fast!

In a traditional speech recognizer, the waveform spoken by a user is split into small consecutive slices or “frames” of 10 milliseconds of audio. Each frame is analyzed for its frequency content, and the resulting feature vector is passed through an acoustic model such as a DNN that outputs a probability distribution over all the phonemes (sounds) in the model. A Hidden Markov Model (HMM) helps to impose some temporal structure on this sequence of probability distributions. This is then combined with other knowledge sources such as a Pronunciation Model that links sequences of sounds to valid words in the target language and a Language Model that expresses how likely given word sequences are in that language. The recognizer then reconciles all this information to determine the sentence the user is speaking. If the user speaks the word “museum” for example - /m j u z i @ m/ in phonetic notation - it may be hard to tell where the /j/ sound ends and where the /u/ starts, but in truth the recognizer doesn’t care where exactly that transition happens: All it cares about is that these sounds were spoken.
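To make that pipeline concrete, here is a small, runnable sketch of the frame-by-frame front end. Everything in it is a toy stand-in (crude spectral features, a fixed random projection in place of the trained DNN); it illustrates the shape of the computation, not Google's recognizer.

```python
import numpy as np

SAMPLE_RATE = 16_000
FRAME_LEN = SAMPLE_RATE * 10 // 1000        # 10 ms of audio per frame
PHONEMES = ["m", "j", "u", "z", "i", "@"]   # toy inventory for "museum"

def frame_features(waveform: np.ndarray) -> np.ndarray:
    """Slice the waveform into consecutive 10 ms frames and compute a
    crude frequency feature (magnitude spectrum) for each frame."""
    n = len(waveform) // FRAME_LEN
    frames = waveform[: n * FRAME_LEN].reshape(n, FRAME_LEN)
    return np.abs(np.fft.rfft(frames, axis=1))

def acoustic_model(features: np.ndarray) -> np.ndarray:
    """Toy stand-in for the DNN: a fixed random projection plus softmax,
    yielding one probability distribution over phonemes per frame."""
    rng = np.random.default_rng(0)
    logits = features @ rng.normal(size=(features.shape[1], len(PHONEMES)))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

waveform = np.random.default_rng(1).normal(size=SAMPLE_RATE)  # 1 s of noise
posteriors = acoustic_model(frame_features(waveform))
print(posteriors.shape)  # (100, 6): one distribution per 10 ms frame
```

In the real recognizer, a decoder then combines these per-frame distributions with the HMM, the Pronunciation Model and the Language Model to find the most likely sentence.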

Our improved acoustic models rely on Recurrent Neural Networks (RNN). RNNs have feedback loops in their topology, allowing them to model temporal dependencies: when the user speaks /u/ in the previous example, their articulatory apparatus is coming from a /j/ sound and from an /m/ sound before. Try saying it out loud - “museum” - it flows very naturally in one breath, and RNNs can capture that. The type of RNN used here is a Long Short-Term Memory (LSTM) RNN which, through memory cells and a sophisticated gating mechanism, memorizes information better than other RNNs. Adopting such models already improved the quality of our recognizer significantly.

The next step was to train the models to recognize phonemes in an utterance without requiring them to make a prediction for each time instant. With Connectionist Temporal Classification, the models are trained to output a sequence of “spikes” that reveals the sequence of sounds in the waveform. They can do this in any way as long as the sequence is correct.
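A concrete way to picture this: the model emits one symbol per frame, including a special "blank" that means "output nothing", and any frame-level sequence that collapses to the correct phonemes counts as correct. Below is a minimal sketch of the standard CTC collapse rule (merge repeated symbols, then drop blanks); the blank character and the exact spike strings are illustrative.

```python
BLANK = "_"  # CTC's "output nothing" symbol

def collapse_ctc(spikes):
    """Collapse a per-frame CTC output into a phoneme sequence:
    merge runs of repeated symbols, then drop the blanks."""
    out, prev = [], None
    for s in spikes:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return out

# Two different spike timings that both collapse to "museum" (/m j u z i @ m/):
print(collapse_ctc("_mm_j_uu__z_i_@@_m_"))  # ['m', 'j', 'u', 'z', 'i', '@', 'm']
print(collapse_ctc("m____juz___i@m_____"))  # same phoneme sequence
```

Both timings collapse to the same phoneme sequence, which is why the model is free to place its spikes wherever it is most confident.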

The tricky part, though, was how to make this happen in real time. After many iterations, we managed to train streaming, unidirectional models that consume the incoming audio in larger chunks than conventional models, but do actual computations less often. With this, we drastically reduced computations and made the recognizer much faster. We also added artificial noise and reverberation to the training data, making the recognizer more robust to ambient noise. You can watch a model learning a sentence here.

We now had a faster and more accurate acoustic model and were excited to launch it on real voice traffic. However, we had to solve another problem - the model was delaying its phoneme predictions by about 300 milliseconds: it had just learned it could make better predictions by listening further ahead in the speech signal! This was smart, but it would mean extra latency for our users, which was not acceptable. We solved this problem by training the model to output phoneme predictions much closer to the ground-truth timing of the speech.
The CTC recognizer outputs spikes as it identifies various phonetic units (in various colors) in the input speech signal. The x-axis shows the acoustic input timing for phonemes and the y-axis shows the posterior probabilities as predicted by the neural network. The dotted line shows where the model chooses not to output a phoneme.
We are happy to announce that our new acoustic models are now used for voice searches and commands in the Google app (on Android and iOS), and for dictation on Android devices. In addition to requiring much lower computational resources, the new models are more accurate, robust to noise, and faster to respond to voice search queries - so give it a try, and happy (voice) searching!

Google Award Program stimulates Journalism and CS collaboration



Last fall, Google invited academic researchers to participate in a Computational Journalism awards program focused on the intersection of Computer Science and Journalism. We solicited proposals for original research projects relevant to today’s fast evolving news industry.

As technology continues to shape and be shaped by the media landscape, applicants were asked to rethink traditional models and roles in the ecosystem, and reimagine the lifecycle of the news story in the online world. We encouraged them to develop innovative tools and open source software that could benefit readers and be game-changers for reporters and publishers. Each award includes funding of $60,000 in cash and $20,000 in computing credits on Google’s Cloud Platform.

We congratulate the recipients of these awards, whose projects are described below, and look forward to the results of their research. Stay tuned for updates on their progress.

Larry Birnbaum, Professor of Electrical Engineering and Computer Science, and Journalism, Northwestern University
Project: Thematic Characterization of News Stories
This project aims to develop computational methods for identifying abstract themes or "angles" in news stories, e.g., seeing a story as an instance of "pulling yourself up by your bootstraps," or as a "David vs. Goliath" story. In collaboration with journalism and computer science students, we will develop applications utilizing these methods in the creation, distribution, and consumption of news content.

Irfan Essa, Professor, Georgia Institute of Technology
Project: Tracing Reuse in Political Language
Our goal in this project is to research, and then develop, a data-mining tool that allows an online researcher to find and trace language reuse. By language reuse we mean: can we detect whether some of the language in a current text can be traced back to another text or script? The technical innovation in this project is aimed at (1) identifying linguistic reuse in documents as well as in other material that can be converted to text, which includes political speeches and videos, and (2) tracing how linguistic reuse spreads through the web and online social networks.
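For illustration only: a common generic technique for this kind of problem is word n-gram "shingling", where long word n-grams shared between two texts suggest reuse. The snippet below sketches that idea; it is not the project's actual method.

```python
def shingles(text, n=5):
    """Return the set of n-word 'shingles' (contiguous word n-grams)."""
    words = text.lower().replace(",", "").split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

speech = "ask not what your country can do for you"
oped = "he urged readers to ask not what your country can do for you, a line with a long history"

# Shared 5-word shingles point to language traceable back to the speech.
print(shingles(speech) & shingles(oped))
```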

Susan McGregor, Assistant Director, Tow Center for Digital Journalism, Columbia Journalism School
Project: InfoScribe
InfoScribe is a collaborative web platform that lets citizens participate in investigative journalism projects by digitizing select data from scanned document sets uploaded by journalists. One of InfoScribe's primary research goals is to explore how community participation in journalistic activities can help improve their accuracy, transparency and impact. Additionally, InfoScribe seeks to build and expand upon understandings of how computer vision and statistical inference can be most efficiently combined with human effort in the completion of complex tasks.

Paul Resnick, Professor, University of Michigan School of Information
Project: RumorLens
RumorLens is a tool that will aid journalists in finding posts that spread or correct a particular rumor on Twitter, by exploring the size of the audiences that those posts have reached. In the collection phase, the user provides one or a few exemplar tweets and then manually classifies a few hundred others as spreading the rumor, correcting it, or unrelated. This enables automatic retrieval and classification of the remaining tweets, which are then presented in an interactive visualization that shows audience sizes.
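As a rough sketch of that classify-then-retrieve loop, using scikit-learn as an illustrative stand-in (the tweets, labels, and model choice below are hypothetical; RumorLens's actual models are not described here):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A few hand-labeled exemplars: spreading the rumor, correcting it, or unrelated.
seed_tweets = [
    "the dam broke, evacuate downtown now",
    "officials confirm the dam did NOT break, stop sharing this",
    "great burrito truck downtown today",
]
seed_labels = ["spread", "correct", "unrelated"]

# Train a simple text classifier on the hand-labeled seed set...
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(seed_tweets, seed_labels)

# ...then classify the remaining retrieved tweets automatically.
print(clf.predict(["heard the dam collapsed??",
                   "the dam is fine, please share the correction"]))
```

In the real tool, those predicted labels would then feed the interactive audience-size visualization.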

Ryan Thornburg, Associate Professor, School of Journalism and Mass Communication, University of North Carolina at Chapel Hill
Project: Public Records Dashboard for Small Newsrooms
Building off our Knight News Challenge effort to bring data-driven journalism to readers of rural newspaper websites, we are developing an internal newsroom tool that will alert reporters and editors to potential story tips found in public data. Our project aims to lower the cost of finding in public data sets stories that shine light in dark places, hold powerful people accountable, and explain our increasingly complex and interconnected world. (Public facing site for the data acquisition element of the project at http://open-nc.org)

Voicecommand Update and Fix for Google Speech v2.0

Sorry it took so long for this. Google's deprecation of the Speech v1.0 API came at the worst possible time for me, as it was right before I started driving across the country.

I didn't get to fix it yesterday when I got into San Francisco either, because I ended up going to Maker Faire and then hanging out at a bar with all the Hackaday people. I met the guy who invented sudo and figured out I have the same watch as Mike Szczys.

Anyways, I've fixed voicecommand to work again and fixed a couple of small bugs, including one in the install script that didn't copy the voicecommand config properly.

To get the fix, just run the update script (or reinstall if you are having config file issues).
I detailed how to update here:
http://stevenhickson.blogspot.com/2013/06/installing-and-updating-piauisuite-and.html

Here's a picture from Maker Faire for your trouble!



Consider donating to further my tinkering since I do all this and help people out for free.




The neural networks behind Google Voice transcription



Over the past several years, deep learning has shown remarkable success on some of the world’s most difficult computer science challenges, from image classification and captioning to translation to model visualization techniques. Recently we announced improvements to Google Voice transcription using Long Short-term Memory Recurrent Neural Networks (LSTM RNNs)—yet another place neural networks are improving useful services. We thought we’d give a little more detail on how we did this.

Since it launched in 2009, Google Voice transcription had used Gaussian Mixture Model (GMM) acoustic models, the state of the art in speech recognition for 30+ years. Sophisticated techniques like adapting the models to the speaker's voice augmented this relatively simple modeling method.

Then around 2012, Deep Neural Networks (DNNs) revolutionized the field of speech recognition. These multi-layer networks distinguish sounds better than GMMs by using “discriminative training,” differentiating phonetic units instead of modeling each one independently.

But things really improved rapidly with Recurrent Neural Networks (RNNs), and especially LSTM RNNs, first launched in Android’s speech recognizer in May 2012. Compared to DNNs, LSTM RNNs have additional recurrent connections and memory cells that allow them to “remember” the data they’ve seen so far—much as you interpret the words you hear based on previous words in a sentence.

By then, Google’s old voicemail system, still using GMMs, was far behind the new state of the art. So we decided to rebuild it from scratch, taking advantage of the successes demonstrated by LSTM RNNs. But there were some challenges.
An LSTM memory cell, showing the gating mechanisms that allow it to store
and communicate information. Image credit: Alex Graves
There’s more to speech recognition than recognizing individual sounds in the audio: sequences of sounds need to match existing words, and sequences of words should make sense in the language. This is called “language modeling.” Language models are typically trained over very large corpora of text, often orders of magnitude larger than the acoustic data. It’s easy to find lots of text, but not so easy to find sources that match naturally spoken sentences. Shakespeare’s plays in 17th-century English won’t help on voicemails.

We decided to retrain both the acoustic and language models, and to do so using existing voicemails. We already had a small set of voicemails users had donated for research purposes and that we could transcribe for training and testing, but we needed much more data to retrain the language models. So we asked our users to donate their voicemails in bulk, with the assurance that the messages wouldn’t be looked at or listened to by anyone—only to be used by computers running machine learning algorithms. But how does one train models from data that’s never been human-validated or hand-transcribed?

We couldn’t just use our old transcriptions, because they were already tainted with recognition errors—garbage in, garbage out. Instead, we developed a delicate iterative pipeline to retrain the models. Using improved acoustic models, we could recognize existing voicemails offline to get newer, better transcriptions that the language models could be retrained on; with better language models we could then re-recognize the same data, and repeat the process. Step by step, the recognition error rate dropped, finally settling at roughly half what it was with the original system! That was an excellent surprise.
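Schematically, that loop looks something like the sketch below, simplified to the language model only; the `recognize` and `train_lm` callables are hypothetical stand-ins, not our actual systems.

```python
def iterative_retraining(voicemails, recognize, train_lm, lm, rounds=5):
    """Self-training sketch: transcribe the unlabeled voicemails with the
    current language model, retrain the LM on the resulting transcripts,
    then re-recognize the same data and repeat."""
    for _ in range(rounds):
        transcripts = [recognize(audio, lm) for audio in voicemails]
        lm = train_lm(transcripts)
    return lm
```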

There were other (not so positive) surprises too. For example, sometimes the recognizer would skip entire audio segments; it felt as if it was falling asleep and waking up a few seconds later. It turned out that the acoustic model would occasionally get into a “bad state” where it would think the user was not speaking anymore and what it heard was just noise, so it stopped outputting words. When we retrained on that same data, we’d think all those spoken sounds should indeed be ignored, reinforcing that the model should do it even more. It took careful tuning to get the recognizer out of that state of mind.

It was also tough to get punctuation right. The old system relied on hand-crafted rules or “grammars,” which, by design, can’t easily take textual context into account. For example, in an early test our algorithms transcribed the audio “I got the message you left me” as “I got the message. You left me.” To try and tackle this, we again tapped into neural networks, teaching an LSTM to insert punctuation at the right spots. It’s still not perfect, but we’re continually working on ways to improve our accuracy.

In speech recognition as in many other complex services, neural networks are rapidly replacing previous technologies. There’s always room for improvement of course, and we’re already working on new types of networks that show even more promise!

Largest collection of Google Logos on the web: Set 8

Largest Collection of Google Logos

[Image gallery: Google logos 299–344]
