Information Blog: be


A Billion Words: Because today's language modeling standard should be higher

Posted by gilogo at 12:27 PM Labels: a, be, because, billion, computer, higher, language, modeling, should, standard, todays, words

Posted by Dave Orr, Product Manager, and Ciprian Chelba, Research Scientist

Language is chock full of ambiguity, and it can turn up in surprising places. Many words are hard to tell apart without context: most Americans pronounce “ladder” and “latter” identically, for instance. Keyboard inputs on mobile devices have a similar problem, especially for IME keyboards. For example, the input patterns for “Yankees” and “takes” look very similar:

Photo credit: Kurt Partridge

But in this context -- the previous two words, “New York” -- “Yankees” is much more likely.

One key way computers use context is with language models. These are used for predictive keyboards, but also speech recognition, machine translation, spelling correction, query suggestions, and so on. Often those are specialized: word order for queries versus web pages can be very different. Either way, having an accurate language model with wide coverage drives the quality of all these applications.
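To make the "New York" example concrete, here is a minimal sketch of a count-based trigram model that ranks candidate next words given the previous two. It is not the model behind any real keyboard: the toy corpus, the function names, and the add-alpha smoothing are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus invented for this illustration; a real model would be
# trained on vastly more text.
TOY_CORPUS = [
    "the new york yankees won again",
    "fans of the new york yankees cheered",
    "new york takes pride in its parks",
    "she takes the train to work",
    "he takes the bus to new york",
]


def train_trigram_counts(sentences):
    """Count how often each word follows each two-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for w1, w2, w3 in zip(words, words[1:], words[2:]):
            counts[(w1, w2)][w3] += 1
    return counts


def next_word_probability(counts, context, word, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(word | context)."""
    context_counts = counts.get(context, Counter())
    total = sum(context_counts.values())
    return (context_counts[word] + alpha) / (total + alpha * vocab_size)


def rank_candidates(counts, context, candidates, vocab_size):
    """Sort candidate words from most to least likely in this context."""
    return sorted(
        candidates,
        key=lambda w: next_word_probability(counts, context, w, vocab_size),
        reverse=True,
    )


counts = train_trigram_counts(TOY_CORPUS)
vocab = {word for sentence in TOY_CORPUS for word in sentence.split()}
ranked = rank_candidates(counts, ("new", "york"), ["takes", "yankees"], len(vocab))
print(ranked)
```

A keyboard decoder would combine a score like this with how well each candidate matches the touch input, so that context breaks the tie when two input patterns look alike.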

Because the components of such complex systems interact, error attribution can be tricky when evaluating their quality. Good engineering practice is to evaluate the quality of each module separately, including the language model. We believe that the field could benefit from a large, standard data set with benchmarks for easy comparison and experiments with new modeling techniques.

To that end, we are releasing scripts that convert a set of public data into a language modeling benchmark of over a billion words, with standardized training and test splits, described in an arXiv paper. Along with the scripts, we’re releasing the processed training and test data in one convenient location. This will make it much easier for the research community to quickly reproduce results, and we hope it will speed up progress on these tasks.

The benchmark scripts and data are freely available, and can be found here: http://www.statmt.org/lm-benchmark/

The field needs a new and better standard benchmark. Currently, researchers report results on a data set of their choice, and those results are very hard to reproduce because there is no standard for preprocessing. We hope that this release will solve both problems and become the standard benchmark for language modeling experiments. As more researchers use the new benchmark, comparisons will be easier and more accurate, and progress will be faster.
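As a rough illustration of why fixed splits matter, the sketch below scores a model on held-out text using perplexity. The post itself does not name a metric; perplexity is assumed here because it is the usual choice for comparing language models. The unigram model, the function names, and the tiny train and test sets are likewise hypothetical stand-ins, not part of the released scripts.

```python
import math
from collections import Counter


def train_unigram(train_sentences, alpha=1.0):
    """Return a function giving an add-alpha smoothed P(word).

    All words unseen in training share one extra vocabulary slot,
    so the probabilities of seen words plus that slot sum to 1.
    """
    counts = Counter(w for s in train_sentences for w in s.split())
    vocab_size = len(counts) + 1  # +1 for the shared unseen-word slot
    total = sum(counts.values())

    def prob(word):
        return (counts[word] + alpha) / (total + alpha * vocab_size)

    return prob


def perplexity(prob, test_sentences):
    """Exponentiated average negative log-probability per test word."""
    log_prob_sum = 0.0
    num_words = 0
    for sentence in test_sentences:
        for word in sentence.split():
            log_prob_sum += math.log(prob(word))
            num_words += 1
    if num_words == 0:
        raise ValueError("test set contains no words")
    return math.exp(-log_prob_sum / num_words)


# Hypothetical fixed splits; with a shared benchmark, everyone would
# train and test on exactly the same sentences.
train_split = ["the cat sat on the mat", "the dog sat on the rug"]
test_split = ["the cat sat on the rug"]

model = train_unigram(train_split)
print(perplexity(model, test_split))
```

Lower perplexity means the model found the test text less surprising. Two papers can only compare these numbers directly when the training data, test data, and preprocessing are identical, which is what a standard benchmark provides.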

For all the researchers out there, try out this benchmark, run your experiments, and let us know how it goes -- or publish, and we’ll enjoy finding your results at conferences and in journals.

Robotic musicians in New Zealand

Posted by gilogo at 11:39 AM Labels: banned, be, computer, in, killer, musicians, new, robotic, robots, should, zealand

Radio New Zealand National recently broadcast an interview with Prof. Dale Carnegie, the head of Victoria University's School of Engineering and Computer Science. Amongst other research, his group have been collaborating with the School of Music to design and build robotic musicians that play specially designed instruments. The really interesting thing is that these "musicians" are not limited by human physiology: they can play faster than a person, they can play chords a human hand couldn't physically span and, of course, they can play longer. You can see a video of a robotic bass performing a cover of Hysteria by Muse. I have some friends who are bass players who will love/hate this!

from The Universal Machine http://universal-machine.blogspot.com/

Should killer robots be banned?

Posted by gilogo at 4:43 PM Labels: banned, be, computer, killer, robots, should

Lethal autonomous weapons systems, or "killer robots" as the public prefer to call them, are almost a reality. In fact, in certain cases, such as Israel's Iron Dome rocket defence system, they already exist. Should the ability of a robot to identify a target and execute an attack without human intervention be outlawed? Many people believe it should, arguing that a robot can never act morally, whilst others argue that in certain circumstances robots may be less dangerous than frightened, stressed and fatigued soldiers. A week-long meeting at the UN in Geneva is currently considering the issue. The UK government has already declared that it opposes an international ban on developing "killer robots", as described in this article in the Guardian. An international coalition of NGOs called the Campaign to Stop Killer Robots is lobbying to have a ban established before the technology is upon us. What do you think?

from The Universal Machine http://universal-machine.blogspot.com/

What of STEM Should Be Computer Science

Posted by gilogo at 5:18 AM Labels: be, computer, of, science, should, stem, what

We keep hearing in the media how many job vacancies there are for computer scientists and how the critical shortage is restricting the growth of many companies. Everyone agrees that we need more people with computing skills. This article from code.org provides an interesting insight into this skills shortage and offers some solutions.

from The Universal Machine http://universal-machine.blogspot.com/

Should your car be programmed to kill you?

Posted by gilogo at 10:23 PM Labels: be, car, computer, kill, programmed, should, to, you, your

Imagine this scenario: you are in your new driverless car and a situation arises (it doesn't matter how) where the driver (the computer) has to decide between crashing into a group of young school children, probably killing several, or slamming the car into a wall and probably killing the passenger, i.e. you! Simple ethics would recommend taking a least-harm approach, but that means maybe killing you. Would you buy a car programmed to kill you? Or would you prefer to buy one that would make the less ethical choice and always seek to protect the car's occupants? These ethical dilemmas are coming to the fore with the advent of autonomous systems. Several years ago the UK's Royal Academy of Engineering published a report on the ethics of emerging technologies and autonomous systems. More recently MIT Technology Review posted a piece titled "Why Self-Driving Cars Must Be Programmed to Kill". My colleague, Paul Ralph, also just gave a radio interview on this subject.

from The Universal Machine http://universal-machine.blogspot.com/
