Showing posts with label camera. Show all posts

Classifying everything using your RPi Camera: Deep Learning with the Pi

For those who don't want to read, the code can be found on my GitHub with a readme:
https://github.com/StevenHickson/RPi_CaffeQuery
You can also read about it on my Hackaday io page here.

What is object classification?

Object classification has been a very popular topic over the past couple of years. Given an image, we want a computer to be able to tell us what that image is showing. The newest trend has been to classify images with convolutional neural networks trained on a large amount of data.

One of the bigger frameworks for this is the Caffe framework. For more on this see the Caffe home page.
You can test out their web demo here. It isn't great at people, but it is very good at cats, dogs, objects, and activities.


Why is this useful?

There are all kinds of autonomous tasks you can do with the RPi camera. Perhaps you want to know if your dog is in your living room, so the Pi can take his/her picture or tell him/her they are a good dog. Perhaps you want your RPi to recognize whether there is fruit in your fruit drawer so it can order you more when it is empty. The possibilities are endless.

How do convolutional neural networks work (a VERY simple overview)?

Convolutional neural networks are loosely based on how the human brain works. They are built of layers of many neurons that are "activated" by certain inputs. The input layer is connected in a network through a series of interconnected neurons in hidden layers like so:
[1]

Each neuron sends its signal to every neuron it is connected to. Each signal is multiplied by the connection weight, and the receiving neuron runs the weighted sum through an activation function such as a sigmoid. The network is trained with backpropagation: given a set of inputs with known outputs, the weights are adjusted to minimize an error function.
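To make that a little more concrete, here is a minimal sketch of a single sigmoid neuron and one weight update in plain Python. It is only an illustration, not how Caffe implements it: the function names, the squared-error loss, and the learning rate are all assumptions I picked for the example.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs, run through the sigmoid."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def train_step(inputs, weights, bias, target, learning_rate=0.5):
    """One gradient-descent update for the squared error 0.5 * (out - target)^2."""
    out = neuron_output(inputs, weights, bias)
    # Chain rule: d(error)/d(weighted sum) = (out - target) * sigmoid'(sum),
    # where sigmoid'(sum) = out * (1 - out).
    delta = (out - target) * out * (1.0 - out)
    new_weights = [w - learning_rate * delta * i for w, i in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias


inputs, weights, bias, target = [1.0, 0.5], [0.2, -0.4], 0.0, 1.0
before = neuron_output(inputs, weights, bias)
weights, bias = train_step(inputs, weights, bias, target)
after = neuron_output(inputs, weights, bias)
print(before, after)  # the output moves toward the target after the update
```

A real network repeats this update across many neurons, layers, and training images, which is why training is so much heavier than classification.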

How do we get this on the Pi?

Well, I went ahead and compiled Caffe on the RPi. Unfortunately, since it doesn't have code to optimize the network with the Pi's GPU, classification takes ~20-25s per image, which is far too slow.
Note: I did find a different optimized CNN network for the RPi by Pete Warden here. It looks great but it still takes about 3 seconds per image, which still doesn't seem fast enough.

You will also need the Raspberry Pi camera which you can get from here:
Raspberry PI 5MP Camera Board Module

A better option: Using the web demo with python

So we can take advantage of the Caffe web demo and use that to reduce the processing time even further. With this method, the image classification takes ~1.5s, which is usable for a system.

How does the code work?

We make a symbolic link from /dev/shm/images/ into our /var/www for Apache and forward router port 5005 to port 80 on the Pi.
Then we use raspistill to take an image and save it to memory as /dev/shm/images/test.jpg. Since this is symlinked in /var/www, we should be able to see it at http://YOUR-EXTERNAL-IP:5005/images/test.jpg.
Then we use grab to pull up the Caffe demo framework with our image and get the classification results. This is done in queryCNN.py, which gets the results.
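The steps above can be sketched roughly like this. This is not the code from queryCNN.py; it only builds the pieces so you can see how they fit together. The helper names, the image size, the demo base URL, and the imageurl query parameter are assumptions for illustration, so check the repo and the demo page for the real values.

```python
from urllib.parse import urlencode

IMAGE_PATH = "/dev/shm/images/test.jpg"  # in-memory path from the post


def capture_command(path=IMAGE_PATH, width=640, height=480):
    """Build a raspistill command that saves one photo to `path`.

    -n disables the preview, -t is the delay in ms, -w/-h set the size,
    and -o is the output file. The size used here is an arbitrary choice.
    """
    return ["raspistill", "-n", "-t", "1000",
            "-w", str(width), "-h", str(height), "-o", path]


def public_image_url(external_ip, port=5005, name="test.jpg"):
    """URL where the outside world can see the symlinked image."""
    return "http://{}:{}/images/{}".format(external_ip, port, name)


def demo_query_url(demo_base, image_url):
    """Attach the image URL to the demo endpoint (parameter name is assumed)."""
    return demo_base + "?" + urlencode({"imageurl": image_url})


image_url = public_image_url("YOUR-EXTERNAL-IP")
print(" ".join(capture_command()))
print(demo_query_url("http://DEMO-HOST/classify_url", image_url))
```

On the Pi you would pass capture_command() to subprocess.check_call and then fetch the query URL with grab (or any HTTP client) and parse the results out of the page.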

What does the output look like?

Given a picture of some of my Pi components, I get this, which is pretty accurate:

Where can I get the code?

https://github.com/StevenHickson/RPi_CaffeQuery

[1] http://white.stanford.edu/teach/index.php/An_Introduction_to_Convolutional_Neural_Networks

Consider donating to further my tinkering since I do all this and help people out for free.




Getting your fridge to order food for you with a RPi camera and a hacked up Instacart API

This is a detailed post on how to get your fridge to autonomously order fruit for you when you are running low. An RPi takes a picture every day and detects if you have fruit or not using my Caffe web query code. If your fridge is low on fruit, it orders fruit using Instacart, which is then delivered to your house. You can find the code with a walkthrough here:
https://github.com/StevenHickson/AutonomousFridge

Some of my posts are things I end up using every day and some are proof of concepts that I think are interesting. This is one of the latter. When I was younger, I heard an urban legend that Bill Gates had a fridge that ordered food for him and delivered it same-day whenever he was low. That story always intrigued me and I finally decided to implement a proof of concept of it. Below is how I set about doing this.

Hacking up an Instacart API

The first thing we need is a service that picks out food and delivers it to you. There are many of these, but as I live in Atlanta, I chose Instacart. Now we need an API. Unfortunately, Instacart doesn't provide one, so we will need to make our own.

Head over to instacart.com, set up an account, and log in. Then right click and view source. You are looking for a line in the source like this:
FirebaseUrl="https://instacart.firebaseio.com/carts/SOME_HASH_STRING_HERE

That string is what you need to access your Instacart account. Open up a terminal and type:
curl https://instacart.firebaseio.com/carts/YOUR_HASH_STRING.json

You should get back a response that looks like this:
{"checkout_state":{"workflow_state":"shopping"},"items":{"1069829":{"created_at":1.409336316211E9,"qty":1,"user_id":YOUR_USER_ID}},"users":{"-JXAzAp6rgtM4u2dV2tI":{"id":YOUR_USER_ID,"name":"StevenH"},"-Jj2_kFsu5hvZRhx4KX1":{"id":YOUR_USER_ID,"name":"Steven H"},"-Jp8VvDusSDOyEiJ0J5D":{"id":YOUR_USER_ID,"name":"Steven H"}}}

Now we just need to figure out what the different items are. Pick a store, start adding items to your cart, and run the same command. If I add some fruit (oranges, bananas, strawberries, pears) to my cart and then run the same curl request, I get something like this:
{"checkout_state":{"workflow_state":"shopping"},"items":{"1069829":{"created_at":1.409336316211E9,"qty":1,"user_id":YOUR_USER_ID},"8182033":{"created_at":1.431448385824E9,"qty":2,"user_id":YOUR_USER_ID},"8583398":{"created_at":1.431448413452E9,"qty":3,"user_id":YOUR_USER_ID},"8585519":{"created_at":1.431448355207E9,"qty":3,"user_id":YOUR_USER_ID},"8601780":{"created_at":1.424915467829E9,"qty":3,"user_id":YOUR_USER_ID},"8602830":{"created_at":1.43144840911E9,"qty":1,"user_id":YOUR_USER_ID}},"users":{"-JXAzAp6rgtM4u2dV2tI":{"id":YOUR_USER_ID,"name":"StevenH"},"-Jj2_kFsu5hvZRhx4KX1":{"id":YOUR_USER_ID,"name":"Steven H"},"-Jp8VvDusSDOyEiJ0J5D":{"id":YOUR_USER_ID,"name":"Steven H"}}}

Now empty your cart and we will make sure we can add all those things to your cart with a curl request. Take your response from earlier, and use it in the following line:
curl -X PATCH -d 'YOUR_FULL_CART_RESPONSE' https://instacart.firebaseio.com/carts/YOUR_HASH_STRING.json

Now, your cart should be full of fruit again. Now we just need a way to recognize whether your fridge has fruit or not.
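If you would rather do the refill from Python than from curl, the same PATCH request can be sketched with the standard library. This is just an illustration under the assumptions of this post (the Firebase cart URL and the JSON you saved earlier); the function names are made up, and the sample item and user id below are placeholders.

```python
import json
import urllib.request


def cart_url(cart_hash):
    """Firebase URL of the cart, using the hash string found in the page source."""
    return "https://instacart.firebaseio.com/carts/{}.json".format(cart_hash)


def build_patch_request(cart_hash, cart):
    """Prepare (but do not send) a PATCH request carrying the saved cart JSON."""
    body = json.dumps(cart).encode("utf-8")
    return urllib.request.Request(
        cart_url(cart_hash),
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# A trimmed-down stand-in for YOUR_FULL_CART_RESPONSE with a placeholder user id.
saved_cart = {"items": {"8182033": {"qty": 2, "user_id": 0}}}
request = build_patch_request("YOUR_HASH_STRING", saved_cart)
print(request.get_method(), request.full_url)
```

Calling urllib.request.urlopen(request) would actually send it, so only do that with your own hash string and a cart you really want refilled.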

Detecting fruit in your fridge

For this we just need a Raspberry Pi 2 Model B Project Board - 1GB RAM - 900 MHz Quad-Core CPU and a Raspberry PI 5MP Camera Board Module.
Set up your camera following these instructions and you will be ready to go. Set up your camera module in your fridge (or wherever you store your fruit).

We are going to use the Caffe framework for recognizing whether fruit is in the refrigerator drawer or not. You can read about how to do that here.
We are going to set this up similarly. Run the following commands to set things up:

git clone https://github.com/StevenHickson/AutonomousFridge.git
sudo apt-get install python python-pycurl python-lxml python-pip
sudo pip install grab
sudo apt-get install apache2
mkdir -p /dev/shm/images
sudo ln -s /dev/shm/images /var/www/images

Then you must forward port 5005 on your router to port 80 on the Pi.
Now you can edit test.sh with your info and run ./test.sh.
Or add the following line to cron with crontab -e:
00 17 * * * /home/pi/AutonomousFridge/test.sh

This script takes a picture with raspistill and puts it in a symlinked directory in memory accessible from port 80. Then it sends that URL to the Caffe web demo and gets the result.
The Caffe demo shows how well it classifies the existence of fruit as shown below:



The end result of this is a script that runs every day at 5 pm. When your fridge doesn't have fruit, it adds a bunch of fruit to your Instacart cart. You can order it at your leisure to make sure you are home when it arrives. You could also use my PiAUISuite to get it to text you about your fruit status. It can be a lot of fun to make a proof of concept of an old urban legend.
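The "does the fridge have fruit?" decision can be sketched like this. It is a hypothetical helper, not the logic in test.sh: I am assuming the classification results have already been parsed into (label, score) pairs, and the word list and threshold are arbitrary choices you would tune for your own fridge.

```python
# Words that count as fruit in this sketch (an arbitrary list).
FRUIT_WORDS = {"fruit", "orange", "banana", "strawberry", "pear"}


def has_fruit(predictions, threshold=0.5):
    """Return True if any confident prediction contains a fruit word.

    `predictions` is a list of (label, score) pairs; the real demo output
    may be shaped differently and would need parsing first.
    """
    for label, score in predictions:
        words = set(label.lower().split())
        if score >= threshold and words & FRUIT_WORDS:
            return True
    return False


full_drawer = [("banana", 0.91), ("refrigerator", 0.40)]
empty_drawer = [("refrigerator", 0.95), ("orange", 0.20)]
print(has_fruit(full_drawer), has_fruit(empty_drawer))
```

When has_fruit returns False, the script would go on to send the cart PATCH request from earlier.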


HDR Low Light and High Dynamic Range photography in the Google Camera App



As anybody who has tried to use a smartphone to photograph a dimly lit scene knows, the resulting pictures are often blurry or full of random variations in brightness from pixel to pixel, known as image noise. Equally frustrating are smartphone photographs of scenes where there is a large range of brightness levels, such as a family photo backlit by a bright sky. In high dynamic range (HDR) situations like this, photographs will either come out with an overexposed sky (turning it white) or an underexposed family (turning them into silhouettes).

HDR+ is a feature in the Google Camera app for Nexus 5 and Nexus 6 that uses computational photography to help you take better pictures in these common situations. When you press the shutter button, HDR+ actually captures a rapid burst of pictures, then quickly combines them into one. This improves results in both low-light and high dynamic range situations. Below we delve into each case and describe how HDR+ works to produce a better picture.

Capturing low-light scenes

The camera on a smartphone has a small lens, meaning that it doesn't gather much light. If a scene is dimly lit, the resulting photograph will contain image noise. One solution is to lengthen the exposure time - how long the sensor chip collects light. This reduces noise, but since it's hard to hold a smartphone perfectly steady, long exposures have the unwanted side effect of blurring the shot. Devices with optical image stabilization (OIS) sense this "camera shake" and shift the lens rapidly to compensate. This allows longer exposures with less blur, but it can't help with really dark scenes.

HDR+ addresses this problem by taking a burst of shots with short exposure times, aligning them algorithmically, and replacing each pixel with the average color at that position across all the shots. Averaging multiple shots reduces noise, and using short exposures reduces blur. HDR+ also begins the alignment process by choosing the sharpest single shot from the burst. Astronomers call this lucky imaging, a technique used to reduce the blurring of images caused by Earth's shimmering atmosphere.
A low light example is captured at dusk. The picture at left was taken with HDR+ off and the picture at right with HDR+ on. The HDR+ image is brighter, cleaner, and sharper, with much more detail seen in the subject’s hair and eyelashes. Photos by Florian Kainz
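The noise benefit of averaging a burst can be illustrated with a small simulation. This is a toy sketch, not the HDR+ pipeline: it assumes the frames are already perfectly aligned, models noise as simple Gaussian noise on a flat gray patch, and all names and numbers are made up for the example.

```python
import random
from statistics import mean, pstdev


def noisy_frame(true_pixels, sigma, rng):
    """One short-exposure frame: the true scene plus Gaussian noise."""
    return [p + rng.gauss(0.0, sigma) for p in true_pixels]


def average_frames(frames):
    """Replace each pixel with its average across all (aligned) frames."""
    return [mean(values) for values in zip(*frames)]


def noise_level(frame, true_pixels):
    """Standard deviation of the error against the true scene."""
    return pstdev([a - b for a, b in zip(frame, true_pixels)])


rng = random.Random(0)            # fixed seed so the run is repeatable
scene = [0.2] * 2000              # a flat, dim gray patch
burst = [noisy_frame(scene, 0.05, rng) for _ in range(8)]

single = noise_level(burst[0], scene)
merged = noise_level(average_frames(burst), scene)
print(single, merged)  # merged noise is roughly single / sqrt(8)
```

For independent noise, averaging N frames cuts the noise by about the square root of N, which is why a burst of short exposures can stand in for one long exposure.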
Capturing high dynamic range scenes

Another limitation of smartphone cameras is that their sensor chips have small pixels. This limits the camera's dynamic range, which refers to the span between the brightest highlight that doesn't blow out (turn white) and the darkest shadow that doesn't look black. One solution is to capture a sequence of pictures with different exposure times (sometimes called bracketing), then align and blend the images together. Unfortunately, bracketing causes parts of the long-exposure image to blow out and parts of the short-exposure image to be noisy. This makes alignment hard, leading to ghosts, double images, and other artifacts.

However, bracketing is not actually necessary; one can use the same exposure time in every shot. By using a short exposure HDR+ avoids blowing out highlights, and by combining enough shots it reduces noise in the shadows. This enables the software to boost the brightness of shadows, saving both the subject and the sky, as shown in the example below. And since all the shots look similar, alignment is robust; you won’t see ghosts or double images in HDR+ images, as one sometimes sees with other HDR software.
A classic high dynamic range situation. With HDR+ off (left), the camera exposes for the subjects’ faces, causing the landscape and sky to blow out. With HDR+ on (right), the picture successfully captures the subjects, the landscape, and the sky. Photos by Ryan Geiss
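The idea of using one short exposure for every shot can be sketched with a toy two-pixel scene. This is an illustration under made-up assumptions, not the actual HDR+ algorithm: the radiance values, the clipping level, the noise model, and the shadow gain are all invented for the example.

```python
import random
from statistics import mean

CLIP = 1.0  # sensor value at which a pixel blows out (assumed)


def read_pixel(radiance, exposure, sigma, rng):
    """One pixel reading: signal scales with exposure, then gets clipped."""
    value = radiance * exposure + rng.gauss(0.0, sigma)
    return min(max(value, 0.0), CLIP)


def burst_pixel(radiance, exposure, sigma, shots, rng):
    """Average the same pixel over a burst of equally exposed shots."""
    return mean([read_pixel(radiance, exposure, sigma, rng) for _ in range(shots)])


rng = random.Random(1)
SKY, FACE = 4.0, 0.2  # bright sky and a backlit face (arbitrary units)

long_sky = read_pixel(SKY, 1.0, 0.02, rng)         # long exposure: sky clips
short_sky = burst_pixel(SKY, 0.2, 0.02, 16, rng)   # short burst: sky survives
short_face = burst_pixel(FACE, 0.2, 0.02, 16, rng)
brightened_face = short_face * 5.0                 # boost the shadows afterwards
print(long_sky, short_sky, brightened_face)
```

The long exposure loses the sky to clipping, while the averaged short burst keeps it and is clean enough that the dark face can be brightened without the noise being amplified too badly.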
Our last example illustrates all three of the problems we’ve talked about - high dynamic range, low light, and camera shake. With HDR+ off, a photo of Princeton University Chapel (shown below) taken with Nexus 6 chooses a relatively long 1/12 second exposure. Although optical image stabilization reduces camera shake, this is a long time to hold a camera still, so the image is slightly blurry. Since the scene was very dark, the walls are noisy despite the long exposure. Therefore, strong denoising is applied, causing smearing (below, left inset image). Finally, because the scene also has high dynamic range, the window at the end of the nave is blown out (below, right inset image), and the side arches are lost in darkness.
Click here to see the full resolution image. Photo by Marc Levoy
HDR+ mode performs better on all three problems, as seen in the image below: the chandelier at left is cleaner and sharper, the window is no longer blown out, there is more detail in the side arches, and since a burst of shots is captured and the software begins alignment by choosing the sharpest shot in the burst (lucky imaging), the resulting picture is sharp.
Click here to see the full resolution image. Photo by Marc Levoy
Here's an album containing these comparisons and others as high-resolution images. For each scene in the album there is a pair of images captured by Nexus 6; the first was taken with HDR+ off, and the second with HDR+ on.

Tips on using HDR+

Capturing a burst in HDR+ mode takes between 1/3 second and 1 second, depending on how dark the scene is. During this time you'll see a circle animating on the screen (left image below). Try to hold still until it finishes. The combining step also takes time, so if you scroll to the camera roll right after taking the shot, you'll see a thumbnail image and a progress bar (right image below). When the bar reaches 100%, your HDR+ picture is ready.
Should you leave HDR+ mode on? We do. The only times we turn it off are for fast-moving sports, because HDR+ pictures take longer to capture than a single shot, or for scenes that are so dark we need the flash. But before you turn off HDR+ for these action shots or super-dark scenes, give it a try; we think you'll be surprised how well it works!

At this time HDR+ is available only on Nexus 5 and Nexus 6, as part of the Google Camera app.


Lens Blur in the new Google Camera app



One of the biggest advantages of SLR cameras over camera phones is the ability to achieve shallow depth of field and bokeh effects. Shallow depth of field makes the object of interest "pop" by bringing the foreground into focus and de-emphasizing the background. Achieving this optical effect has traditionally required a big lens and aperture, and therefore hasn’t been possible using the camera on your mobile phone or tablet.

That all changes with Lens Blur, a new mode in the Google Camera app. It lets you take a photo with a shallow depth of field using just your Android phone or tablet. Unlike a regular photo, Lens Blur lets you change the point or level of focus after the photo is taken. You can choose to make any object come into focus simply by tapping on it in the image. By changing the depth-of-field slider, you can simulate different aperture sizes, to achieve bokeh effects ranging from subtle to surreal (e.g., tilt-shift). The new image is rendered instantly, allowing you to see your changes in real time.

Lens Blur replaces the need for a large optical system with algorithms that simulate a larger lens and aperture. Instead of capturing a single photo, you move the camera in an upward sweep to capture a whole series of frames. From these photos, Lens Blur uses computer vision algorithms to create a 3D model of the world, estimating the depth (distance) to every point in the scene. Here’s an example -- on the left is a raw input photo, in the middle is a “depth map” where darker things are close and lighter things are far away, and on the right is the result blurred by distance:

Here’s how we do it. First, we pick out visual features in the scene and track them over time, across the series of images. Using computer vision algorithms known as Structure-from-Motion (SfM) and bundle adjustment, we compute the camera’s 3D position and orientation and the 3D positions of all those image features throughout the series.

Once we’ve got the 3D pose of each photo, we compute the depth of each pixel in the reference photo using Multi-View Stereo (MVS) algorithms. MVS works the way human stereo vision does: given the location of the same object in two different images, we can triangulate the 3D position of the object and compute the distance to it. How do we figure out which pixel in one image corresponds to a pixel in another image? MVS measures how similar they are -- on mobile devices, one particularly simple and efficient way is computing the Sum of Absolute Differences (SAD) of the RGB colors of the two pixels.
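As a small illustration of the similarity measure mentioned above, here is the Sum of Absolute Differences for two RGB pixels, plus a helper that picks the best-matching candidate. This is only a sketch of the measure itself; the function names are made up, and a real MVS system compares many pixels across many images and candidate depths.

```python
def sad(pixel_a, pixel_b):
    """Sum of Absolute Differences between two (R, G, B) pixels."""
    return sum(abs(a - b) for a, b in zip(pixel_a, pixel_b))


def best_match(pixel, candidates):
    """Index of the candidate pixel most similar to `pixel` (lowest SAD)."""
    return min(range(len(candidates)), key=lambda i: sad(pixel, candidates[i]))


reference = (200, 10, 10)
candidates = [(0, 0, 0), (190, 20, 5), (255, 255, 255)]
print(sad((10, 20, 30), (12, 18, 30)))   # 2 + 2 + 0 = 4
print(best_match(reference, candidates)) # index 1 is the closest color
```

A lower SAD means the two pixels look more alike, so the match with the lowest score is the most likely correspondence.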

Now it’s an optimization problem: we try to build a depth map where all the corresponding pixels are most similar to each other. But that’s typically not a well-posed optimization problem -- you can get the same similarity score for different depth maps. To address this ambiguity, the optimization also incorporates assumptions about the 3D geometry of a scene, called a "prior," that favors reasonable solutions. For example, you can often assume two pixels near each other are at a similar depth. Finally, we use Markov Random Field inference methods to solve the optimization problem.

Having computed the depth map, we can re-render the photo, blurring pixels by differing amounts depending on the pixel’s depth, aperture and location relative to the focal plane. The focal plane determines which pixels to blur, with the amount of blur increasing proportionally with the distance of each pixel to that focal plane. This is all achieved by simulating a physical lens using the thin lens approximation.
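For a feel of how blur depends on depth and aperture, here is a sketch of the standard thin lens "circle of confusion" formula. This is the textbook relation, not necessarily the exact computation inside Lens Blur; the function name and the example lens values are assumptions, and all distances are in the same unit (millimeters here).

```python
def circle_of_confusion(aperture_diameter, focal_length, focus_distance, object_distance):
    """Blur-circle diameter on the sensor under the thin lens approximation.

    c = A * f * |d - s| / (d * (s - f)), where A is the aperture diameter,
    f the focal length, s the focused distance, and d the object distance.
    """
    if focus_distance <= focal_length:
        raise ValueError("focus distance must be greater than the focal length")
    return (aperture_diameter * focal_length * abs(object_distance - focus_distance)
            / (object_distance * (focus_distance - focal_length)))


# Simulated 50 mm lens focused at 2 m, with a 25 mm aperture (f/2).
for depth_mm in (2000, 3000, 4000):
    print(depth_mm, circle_of_confusion(25.0, 50.0, 2000.0, depth_mm))
```

Pixels on the focal plane get zero blur, blur grows as a pixel's depth moves away from that plane, and a wider simulated aperture scales the blur up, which matches what the depth-of-field slider does.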

The algorithms used to create the 3D photo run entirely on the mobile device, and are closely related to the computer vision algorithms used in 3D mapping features like Google Maps Photo Tours and Google Earth. We hope you have fun with your bokeh experiments!