Smartphone Speech Recognition Is 3X Faster Than Texting

Speech-recognition software is not only three times faster at texting than human typists, it’s also more accurate.

Want to save some time? New research suggests you should be using your smartphone’s speech-recognition software to text, instead of your thumbs.

Researchers at Stanford University recently devised an experiment pitting Chinese tech giant Baidu’s speech recognition software against 32 texters, ages 19 to 32, working with the built-in keyboard on an Apple iPhone. Baidu’s Deep Speech 2 software was not only three times faster than the human typists, it was also more accurate.

The researchers hope this revelation “spurs the development of innovative applications of speech recognition technology,” which has historically gotten a pretty bad rap, often billed as slow and inaccurate.

Prof. Geoffrey Hinton Awarded IEEE Medal For His Work In Artificial Intelligence

Stanford team creates computer vision algorithm that can describe photos

Computers only recently began to get the software needed to discern unknown objects; now machine learning takes computer vision to the next level with a system that can describe objects and put them into context. Coming soon: better visual search?

Stanford Professor Fei-Fei Li, director of the Stanford Artificial Intelligence Lab, leads work on a computer vision system.

Computer software only recently became smart enough to recognize objects in photographs. Now, Stanford researchers using machine learning have created a system that takes the next step, writing a simple story of what’s happening in any digital image.

“The system can analyze an unknown image and explain it in words and phrases that make sense,” said Fei-Fei Li, a professor of computer science and director of the Stanford Artificial Intelligence Lab.

“This is an important milestone,” Li said. “It’s the first time we’ve had a computer vision system that could tell a basic story about an unknown image by identifying discrete objects and also putting them into some context.”

Humans, Li said, create mental stories that put what we see into context. “Telling a story about a picture turns out to be a core element of human visual intelligence but so far it has proven very difficult to do this with computer algorithms,” she said.

At the heart of the Stanford system are algorithms that enable the system to improve its accuracy by scanning scene after scene, looking for patterns, and then using the accumulation of previously described scenes to extrapolate what is being depicted in the next unknown image.
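
The accumulate-and-extrapolate loop described above can be caricatured with a tiny retrieval sketch. This is a hypothetical toy, not the Stanford system (which uses neural networks): represent each previously described scene by its detected objects, and caption a new image by borrowing the description of the most similar scene seen so far.

```python
# Toy illustration (not the Stanford system): caption a new scene by
# reusing the description of the previously seen scene whose detected
# objects overlap most with the new image's objects.

described_scenes = [
    ({"dog", "frisbee", "grass"}, "a dog catches a frisbee on the grass"),
    ({"man", "horse", "field"},   "a man rides a horse in a field"),
    ({"cat", "sofa"},             "a cat sleeps on a sofa"),
]

def caption(objects):
    """Pick the stored description whose object set overlaps most."""
    best = max(described_scenes, key=lambda scene: len(scene[0] & objects))
    return best[1]

print(caption({"dog", "grass", "ball"}))
# -> "a dog catches a frisbee on the grass"
```

The real system learns continuous representations rather than matching object sets, but the core loop is the same: each described scene enriches the pool of patterns used to explain the next unknown image.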

Popular Deep Learning Tools – a review

Deep Learning is the hottest trend now in AI and Machine Learning. We review the popular software for Deep Learning, including Caffe, Cuda-convnet, Deeplearning4j, Pylearn2, Theano, and Torch.

By Ran Bi.

Deep Learning is now one of the hottest trends in Artificial Intelligence and Machine Learning, with daily reports of amazing new achievements, such as outperforming humans on an IQ test.

In the 2015 KDnuggets Software Poll, a new category for Deep Learning Tools was added; the most popular tools from that poll are listed below.

  • Pylearn2 (55 users)
  • Theano (50)
  • Caffe (29)
  • Torch (27)
  • Cuda-convnet (17)
  • Deeplearning4j (12)
  • Other Deep Learning Tools (106)

I haven’t used all of them, so this is a brief summary of these popular tools based on their homepages and tutorials.

Theano & Pylearn2:

Theano and Pylearn2 are both developed at the University of Montreal, with most developers in the LISA group led by Yoshua Bengio. Theano is a Python library; you can also think of it as a mathematical expression compiler. It is good for building algorithms from scratch. Here is an intuitive example of Theano training.
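
To make the “expression compiler” idea concrete, here is a purely illustrative, standard-library Python sketch — not Theano’s actual API (which uses `theano.tensor` variables and `theano.function` to compile symbolic graphs to optimized C code). The point is the two-phase workflow: first build a symbolic expression tree without computing anything, then compile it once into a callable function.

```python
# Illustrative sketch of the "expression compiler" idea behind Theano
# (standard library only; this is NOT Theano's real API).

class Var:
    """A symbolic variable: combining variables records a tree, not values."""
    def __init__(self, name):
        self.name = name
    def __add__(self, other):
        return Expr('+', self, other)
    def __mul__(self, other):
        return Expr('*', self, other)
    def source(self):
        return self.name

class Expr(Var):
    """An operation node in the symbolic expression tree."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def source(self):
        return f'({self.left.source()} {self.op} {self.right.source()})'

def compile_expr(expr, *inputs):
    """'Compile' the symbolic tree once into a plain Python function."""
    src = f'lambda {", ".join(v.name for v in inputs)}: {expr.source()}'
    return eval(src)

x, y = Var('x'), Var('y')
loss = x * x + y          # symbolic graph: nothing is computed yet
f = compile_expr(loss, x, y)
print(f(3.0, 2.0))        # -> 11.0
```

Theano’s payoff from this separation is that the graph can be simplified, differentiated, and compiled to fast native (or GPU) code before any data flows through it.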

If we want to use standard algorithms, we can write Pylearn2 plugins as Theano expressions, and Theano will optimize and stabilize the expressions. Pylearn2 includes everything needed for multilayer perceptrons, RBMs, stacked denoising autoencoders, and ConvNets. Here is a quick start tutorial that walks you through some basic ideas of Pylearn2.

Caffe:

Caffe is developed by the Berkeley Vision and Learning Center; it was created by Yangqing Jia and is led by Evan Shelhamer. It is a fast and readable implementation of ConvNets in C++. As shown on its official page, Caffe can process over 60M images per day on a single NVIDIA K40 GPU with AlexNet. It can be used as a toolkit for image classification, but not for other deep learning applications such as text or speech.
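
Part of what makes Caffe readable is that networks are defined declaratively in prototxt (protobuf text format) files rather than code. A single convolution layer looks roughly like this — a fragment sketched from memory, not copied from the official AlexNet release, so treat the parameter values as illustrative:

```
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"      # input blob
  top: "conv1"        # output blob
  convolution_param {
    num_output: 96    # number of filters
    kernel_size: 11
    stride: 4
  }
}
```

Chaining such layer blocks by matching each layer’s `top` to the next layer’s `bottom` defines the whole network, which Caffe then trains from a separate solver prototxt without any user-written training loop.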

Torch & OverFeat:

Torch is written in Lua and is used at NYU, the Facebook AI lab, and Google DeepMind. It claims to provide a MATLAB-like environment for machine learning algorithms. Why did its authors choose Lua/LuaJIT instead of the more popular Python? They say in the Torch7 paper that “Lua is easily integrated with C, so within a few hours’ work, any C or C++ library can become a Lua library.” Because Lua is written in pure ANSI C, it can be easily compiled for arbitrary targets.

OverFeat is a feature extractor trained on the ImageNet dataset with Torch7 and also easy to start with.

MarI/O – Machine Learning for Video Games