Computer Science Department

CS491E/791E: Computer Vision (Spring 2004)

Meets: TR: 2:30pm - 3:45pm at SEM 326

Instructor: Dr. George Bebis

Email: bebis@cs.unr.edu
Phone: (775) 784-6463
Office: 235 SEM
Office Hours: TR noon - 1:30pm

Text (required):

Introductory Techniques for 3-D Computer Vision by Emanuele Trucco, Alessandro Verri, Prentice Hall, 1998.

Optional Texts:

R. Jain et al. Machine Vision, McGraw-Hill, 1995.
Forsyth and Ponce Computer Vision - A modern approach Prentice Hall, 2002.
Shapiro and Stockman Computer Vision Prentice Hall, 2001.
E. Davies Machine Vision, Academic Press, 1996.
V. Nalwa A Guided Tour of Computer Vision, Addison-Wesley, 1993.
S. Umbaugh Computer Vision and Image Processing: A Practical Approach Using CVIPtools, Prentice Hall, 1997.
J. Parker Practical Computer Vision, John Wiley & Sons, 1996.
R. Haralick and L. Shapiro Computer and Robot Vision (vol I and II)
M. Sonka et al. Image Processing, Analysis, and Machine Vision, Brooks/Cole Pub Co., 1998.

Other books (available in the DeLaMare library)

Horn, Berthold Robot vision, McGraw-Hill, 1986.

Baxes, Gregory Digital image processing: principles and applications, Wiley, 1994.

Ballard, Dana Computer vision, Prentice-Hall, 1982.

Bow, Sing-Tze Pattern recognition and image preprocessing, New York: M. Dekker, 1992.

Hall, Ernest Computer image processing and recognition, New York: Academic Press, 1979.

R. Gonzalez et al. Digital Image Processing, Addison-Wesley, 1993.

K. Castleman Digital Image Processing, Prentice-Hall, 1996.

J. Russ Image Processing Handbook, CRC Press, 1998.

Major CV Journals:

IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)
Computer Vision and Image Understanding (CVIU)
International Journal of Computer Vision (IJCV)
Image and Vision Computing (IVC)
IEEE Transactions on Image Processing (TIP)
Pattern Recognition (PR)
Machine Vision and Applications (MVA)

Major CV Conferences:

International Conference on Computer Vision (ICCV)
Computer Vision and Pattern Recognition (CVPR)
International Conference on Pattern Recognition (ICPR)
International Conference on Image Processing (ICIP)
Other computer vision conferences

Computer Vision resources

Computer Vision Home Page
Computer Vision Industry
UNR Computer Vision Resources
UNR Computer Vision Laboratory (CVL)
UNR Combined Research-Curriculum Development in Computer Vision (CRCD)
Summer Research In Computer Vision

Useful links to Neural Networks and Genetic Algorithms resources

Neural Network Home Page

Neural Networks Conferences
Genetic Algorithms Archive

Useful links to Pattern Recognition resources

Useful links to Algorithms resources

The Stony Brook Algorithm Repository

Useful Mathematics, Statistics, and Geometry resources

Course Description

The goal of computer vision is to develop the theoretical and algorithmic basis by which useful information about the world can be automatically extracted and analyzed from an observed image, image set, or image sequence. Since images are two-dimensional projections of the three-dimensional world, the information is not directly available and must be recovered. This is a very difficult problem because projection is a many-to-one mapping, so it cannot be inverted uniquely from the image alone. To recover the information, knowledge about the objects in the scene and the projection geometry is required.
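
To illustrate why depth is lost in projection, here is a minimal sketch assuming an ideal pinhole camera with focal length f and points given in camera coordinates. The names Point3, Point2, project, and scaleAlongRay are hypothetical and chosen only for this illustration; they do not come from the course materials.

```cpp
#include <cmath>

// A 3D point in camera coordinates (z is depth along the optical axis).
struct Point3 { double x, y, z; };

// A 2D point on the image plane.
struct Point2 { double x, y; };

// Ideal pinhole (perspective) projection with focal length f:
//   x = f * X / Z,  y = f * Y / Z.
// Assumes p.z is nonzero.
Point2 project(const Point3& p, double f) {
    return { f * p.x / p.z, f * p.y / p.z };
}

// Move a 3D point along the ray through the camera center by a factor s.
// Every point on this ray projects to the same image location.
Point3 scaleAlongRay(const Point3& p, double s) {
    return { s * p.x, s * p.y, s * p.z };
}
```

Projecting a point and any scaled copy of it along the same ray gives the same image coordinates, which is why a single image cannot determine depth without additional knowledge about the scene or a second view.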

Computer Vision systems have many potential applications. Robots that can see are more likely to interact with the real world in a satisfactory manner; they can choose objects, avoid obstacles, plan routes, calculate their velocity and orientation, identify dangerous situations, etc. Such robots will be useful in exploring dangerous or very distant environments (e.g., other planets, inside nuclear reactors). Computer Vision can be used to help cameras follow the trajectory of people and vehicles, for example for traffic monitoring; it can help in the identification of faces for security clearance; it can be used for converting 2D images into 3D models that can then be rotated and manipulated, for example to present medical or sporting images from a better angle; and it can be used for inspecting medical images to identify tumours and other ailments.

Over the next decade, it is anticipated that Computer Vision systems will become commonplace, and that vision technology will be applied across a broad range of business and consumer products. This implies that there will be strong industry demand for computer vision engineers: people who understand vision technology and know how to apply it to real-world problems. This course will cover the fundamentals of Computer Vision. It is suited mainly for students who are interested in doing research in the area of Computer Vision. For graduate students, there are many open problems in this area suitable for investigation leading to a Master's thesis or a Ph.D. dissertation.

Course Outline (tentative)

Introduction to Computer Vision (Trucco, Chapt 1)
Image Processing Review
Image Formation (Nalwa, Chapt 2)
Camera Parameters (Trucco, Chapt 2)
Camera Calibration (Trucco, Chapt 6)
Stereo (Trucco, Chapt 7)
Motion (Trucco, Chapt 8)
Shape from shading and from texture (Trucco, Chapt 9)
Recognition (Trucco, Chapt 10)
Applications

Exams and Assignments

Exams: There will be a midterm and a final exam.

Homework: Homework problems will be assigned on a regular basis and will be collected at the beginning of the class on the due date. Solution sets will be provided for all problems assigned.

Programming Assignments: There will be about 4-5 programming assignments, which should be done on an individual basis. For each programming assignment, you are to turn in a brief report which should include a description of the problem, a description of your approach, and your evaluation of the results. Details of the deliverables will be given with each assignment. Discussion of the programming assignments is allowed and encouraged. However, each student should do his/her own work. Assignments which are too similar will receive a zero.

Paper presentation: Each student will be required to present a paper to the rest of the class. The presentation should be professional, as if it were given at a formal conference (i.e., slides/projector). More details will be provided in class.

Course Prerequisites

The prerequisites for this course are CS308 (Data Structures) and CS474/674 (Image Processing and Interpretation); however, I will waive the CS474/674 requirement depending on your background and interests. Good programming skills and mathematical background are essential.

C++ information

C++ Coding Standard (also: how to mix C and C++)
C++ Archive

Software

Xv
xv(1) is an interactive image display program for the X window system that is useful for displaying and editing images in a variety of formats.
CVIPtools
GUI-based computer vision and image processing tools for Windows95/NT and UNIX, with ANSI-C source code and libraries for image analysis, image compression, image enhancement, image restoration, and many imaging utilities. It also contains an extended Tcl shell with all the computer imaging functions. It has been installed on the Suns in the /image directory. To run it, first add the following into your .cshrc file:
Then enter: source .cshrc to update your settings. You can run CVIPtools by entering CVIPtools.
Gimp
The Gimp is the GNU image manipulation program.
Intel Computer Vision Library (OpenCV)
Image processing and computer vision algorithms optimized to run on Intel microprocessors.
Matlab
Matlab(1) is a numeric computation and visualization environment. The image processing and signal processing toolboxes are especially useful. See also: Matlab Tutorial (Univ Utah), Matlab Basics (RPI), Matlab Primer (200K postscript; 25 pages).
Microsoft Vision SDK Library
A library for writing image processing and computer vision programs on Microsoft Windows machines.
Xforms
A GUI toolkit based on Xlib for the X Window System. Xforms has been installed on the Suns in /usr/local/src/xforms and on the Linux boxes in LME 314 (see some more examples; an updated version of the manual is here).
How to capture an MPEG file and store the frames into separate files
How to make MPEG movies
More software .... (Good stuff !!)

Ghostview (software for viewing postscript files)

Gnuplot (a command-driven interactive function plotting program)

ResearchIndex (a scientific literature digital library - find papers easily!)

Syllabus

Spring 2004 syllabus (pdf)

Sample Midterm Exam and Study Guide

Spring 2002 postscript or pdf
Midterm exam study guide pdf
Final exam study guide pdf

Image-Processing-Related Material

How are images represented in the computer?
Image Processing Fundamentals (uncompress using "gunzip")
PGM Image Format
Image Formats and Viewers
More on Image Formats....

Source Code

C++ routine to read a PGM image from a file: ReadImage.cpp
C++ routine to read a PPM (color) image from a file: ReadImage.cpp
C++ routine to write a PGM image to a file: WriteImage.cpp
C++ routine to write a PPM (color) image to a file: WriteImage.cpp
An example: Threshold.cpp
OpenCV: example

Reading Assignments

What is Computer Vision?
Computers Seeing People (by Irfan Essa, AI Magazine, 1999)
Looking at People: Sensing for Ubiquitous and Wearable Computing (by Alex Pentland, PAMI 2000)

Lectures

Computer Vision Review (powerpoint)
Image Processing Review (powerpoint)
Point Processing
Frame Processing
Geometric Processing
Area Processing
Image Segmentation
Edge Detection
Edge Contour Extraction

Line Detection
- Study the paper: R. Duda and P. Hart, "Use of the Hough Transformation to Detect Lines and Curves in Pictures", Communications of the ACM (Graphics and Image Processing), vol. 15, no. 1, pp. 11-15, January 1972.
Generalized Hough Transform
Deformable Contours (Snakes)
- Study the paper: Williams, Donna and Shah, Mubarak. "A Fast Algorithm for Active Contours and Curvature Estimation", CVGIP: Image Understanding. Vol. 55, No. 1, January 1992. pp. 14-26.
Edge Contour Representation
Region Based Segmentation
Thresholding
Region Growing
- Study the paper: Besl and Jain. "Segmentation Through Variable-Order Surface Fitting", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 10, no. 2, pp. 167-192, 1988.
Region Splitting and Merging
Connected Components
Region Representation
Corner Detection

Read the review material on Linear Algebra from the Mathematical Methods for Computer Vision page.
2D Geometric Transformations
3D Geometric Transformations
Singular Value Decomposition
Image Formation
Perspective Projection
Geometric Camera Parameters
Camera Calibration
- Study the paper: Z. Zhang. "A flexible new technique for camera calibration" IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1330-1334, 2000 (also, click here).
Stereo Camera
Stereo Correspondence Problem
- Study the paper: T. Kanade and M. Okutomi. "A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment" IEEE International Conference on Robotics and Automation, pp. 1088-1095, 1991.
Epipolar Geometry
Stereo Reconstruction

Homework

Homework 1 (due on: 2/10/04)
Homework 2 (due on: 2/24/04)
Homework 3 (due on: 3/23/04)
Homework 4 (due on: 4/22/04)
Homework 5 (due on: 4/29/04)

Programming Assignments

Prog. Assignment 1 (due on: 2/26/04)
- The set of face images can be found here.
- The code to compute the 1D Gaussian mask can be found here.
- The code to solve overdetermined systems of equations using Singular Value Decomposition (SVD) can be found here.
- See an example of how to call a C function from C++ here
Prog. Assignment 2 (due on: 3/23/04)
- Use lena.pgm and o1r1.pgm to test the Canny edge detector.
- Use images from Image Gallery 2 to test your Hough Transform. Also, make sure you use the following images: overlap4.pgm, overlap5.pgm, and overlap6.pgm (coins overlapping).
- Use square.pgm, square_noise.pgm, geor.0004.pgm, geor.0063.pgm, and model1.pgm for testing the snake approach.
Prog. Assignment 3 (due on: 4/27/04)
- Click here to find the calibration data
- Information about the meaning of the files and OpenCV's calibration procedure can be found here
- svdcmp.c (there is a copy of the book "Numerical Recipes" in the lab)
- OpenCV F.A.Q. and manual
Prog. Assignment 4 (due on: 5/13/04)
- R. Hartley, "In Defense of the Eight-Point Algorithm", IEEE Transactions on Pattern Analysis and Machine Intelligence 19(6): 580-593 (1997).
- Click here to find the data you need to use in your experiments.
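
As a point of reference for the 1D Gaussian mask mentioned under Prog. Assignment 1, here is a minimal sketch of one common way to build such a mask. It is not the code linked above; the function name gaussianMask1D and its parameters sigma and halfWidth are assumptions made for this illustration, and normalizing the mask so it sums to 1 is a design choice rather than a requirement stated on this page.

```cpp
#include <cmath>
#include <vector>

// Build a 1D Gaussian mask with 2 * halfWidth + 1 samples.
// Sample i (for i in [-halfWidth, halfWidth]) is exp(-i^2 / (2 * sigma^2)),
// and the samples are then normalized so that they sum to 1.
// Assumes sigma > 0 and halfWidth >= 0.
std::vector<double> gaussianMask1D(double sigma, int halfWidth) {
    std::vector<double> mask(2 * halfWidth + 1);
    double sum = 0.0;
    for (int i = -halfWidth; i <= halfWidth; ++i) {
        double v = std::exp(-(i * i) / (2.0 * sigma * sigma));
        mask[i + halfWidth] = v;
        sum += v;
    }
    for (double& v : mask) {
        v /= sum;  // normalize so that smoothing preserves mean brightness
    }
    return mask;
}
```

Because the 2D Gaussian is separable, a mask like this can be applied along the rows and then along the columns of an image to obtain 2D Gaussian smoothing. A half-width of roughly 2.5 to 3 times sigma is a common rule of thumb for keeping truncation error small.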

Paper Presentations

Oral Presentation Advice

REPRESENTATION

P. Burt and E. Adelson, "The Laplacian Pyramid as a Compact Image Code", IEEE Transactions on Communications, 31(4), 1983, pp. 532-540. (assigned to Saurabh Singh, May 13, 10:00-11:00AM)

SEGMENTATION

Chad Carson, Serge Belongie, Hayit Greenspan and Jitendra Malik, "Blobworld: Color- and Texture-Based Image Segmentation Using EM and Its Application to Image Querying and Classification", IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(8):1026-1038, August 2002. (assigned to Javier Martinez, May 13, 1:00-2:00PM)
J. Shi and J. Malik, "Normalized Cuts and Image Segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), August 2000, pp. 888-905.
Mahamud, S.; Williams, L.R.; Thornber, K.K.; Kanglin Xu, "Segmentation of multiple salient closed contours from real images", IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(4), pp. 433-444, 2003.
D. Jacobs, "Robust and Efficient Detection of Convex Groups", IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(1), pp. 23-37, 1996. (assigned to Junmei Wang, May 12, 2:00-3:00 PM)

CAMERA CALIBRATION

Heikkila, J., "Geometric camera calibration using circular control points", IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(10), pp. 1066-1077, 2000.

MOTION

C. Tomasi and T. Kanade, "Shape and motion from image streams under orthography - a factorization method", Technical Report TR-92-1270, Cornell University, March 1992.
A. Verri and T. Poggio, "Motion Field and Optical Flow: Qualitative Properties", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 5, pp. 490-498, 1989.
Wang, J.Y.A.; Adelson, E.H., "Representing moving images with layers", IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 625-638, 1994. (assigned to Sohei Okamoto, May 13, 3-4PM)

TRACKING

Jianbo Shi, Carlo Tomasi, "Good Features to Track", IEEE Conference on Computer Vision and Pattern Recognition (CVPR'94) (assigned to Yan Tong, May 13, 11:00-noon)
Dieter Koller, Joseph Weber, Jitendra Malik, "Robust Multiple Car Tracking with Occlusion Reasoning", Technical Report UCB/CSD 93/780, Computer Science Division (EECS), University of California, Berkeley, 1993. (assigned to Kai She - May 13, 2003, 2:00-3:00pm)

APPLICATIONS

Kanade, T.; Rander, P.; Narayanan, P.J., "Virtualized reality: constructing virtual worlds from real scenes", IEEE Multimedia Magazine, Vol. 4, No. 1, pp. 34-47, 1997. (assigned to Beifang Yi, May 12, 11:00-noon)
Richard Szeliski, "Video Mosaics for Virtual Environments", IEEE Computer Graphics and Applications, vol. 16, no. 2, pp. 22-30, 1996. (assigned to Mehmet Eser, May 12, 10:00-11:00AM)
M. Irani and S. Peleg, "Improving Resolution by Image Registration", CVGIP: Graphical Models and Image Processing, vol. 53, no. 3, pp. 231-239, 1991.

Department of Computer Science, University of Nevada, Reno, NV 89557
Page created and maintained by: Dr. George Bebis (bebis@cs.unr.edu)