Computer Science Department
CS491E/791E: Computer Vision (Spring 2004)
Meets: TR 2:30pm - 3:45pm in SEM 326
Instructor: Dr. George Bebis
- Email: bebis@cs.unr.edu
- Phone: (775) 784-6463
- Office: 235 SEM
- Office Hours: TR noon - 1:30pm
Text (required):
Optional Texts:
- R. Jain et al., Machine Vision, McGraw-Hill, 1995.
- Forsyth and Ponce, Computer Vision: A Modern Approach, Prentice Hall, 2002.
- Shapiro and Stockman, Computer Vision, Prentice Hall, 2001.
- E. Davies, Machine Vision, Academic Press, 1996.
- V. Nalwa, A Guided Tour of Computer Vision, Addison-Wesley, 1993.
- S. Umbaugh, Computer Vision and Image Processing: A Practical Approach Using CVIPtools, Prentice Hall, 1997.
- J. Parker, Practical Computer Vision, John Wiley & Sons, 1996.
- R. Haralick and L. Shapiro, Computer and Robot Vision (vol. I and II).
- M. Sonka et al., Image Processing, Analysis, and Machine Vision, Brooks/Cole Pub Co., 1998.
Other books (available in the DeLaMare library)
- Horn, Berthold, Robot Vision, McGraw-Hill, 1986.
- Baxes, Gregory, Digital Image Processing: Principles and Applications, Wiley, 1994.
- Ballard, Dana, Computer Vision, Prentice-Hall, 1982.
- Bow, Sing-Tze, Pattern Recognition and Image Preprocessing, New York: M. Dekker, 1992.
- Hall, Ernest, Computer Image Processing and Recognition, New York: Academic Press, 1979.
- R. Gonzalez et al., Digital Image Processing, Addison-Wesley, 1993.
- K. Castleman, Digital Image Processing, Prentice-Hall, 1996.
- J. Russ, Image Processing Handbook, CRC Press, 1998.
Major CV Journals:
Major CV Conferences:
Computer Vision resources
Useful links to Neural Networks and Genetic Algorithms resources
Useful links to Pattern Recognition resources
Useful links to Algorithms resources
Useful Mathematics, Statistics, and Geometry resources
Course Description
The goal of computer vision is to develop the theoretical and algorithmic basis
by which useful information about the world can be automatically extracted and
analyzed from an observed image, image set, or image sequence. Since images are
two-dimensional projections of the three-dimensional world, the information is
not directly available and must be recovered. This is a very difficult problem
given that the projection is a many-to-one mapping and cannot be inverted uniquely. To recover the information,
knowledge about the objects in the scene and projection geometry is required.
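The many-to-one nature of image projection can be seen in a minimal sketch. It assumes an ideal pinhole camera at the origin looking down the Z axis with focal length f; the function name project and the sample points are hypothetical and chosen only for illustration.

```python
def project(point, f=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the image plane at distance f.

    Assumes an ideal pinhole camera at the origin looking down the Z axis
    (an illustrative model, not one prescribed by the course).
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)


near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # same viewing ray, three times farther away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Both points land on the same image location, so depth along the viewing ray cannot be recovered from a single image without additional knowledge.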
Computer Vision systems have many potential applications. Robots that can
see are more likely to interact with the real world in a satisfactory manner;
they can choose objects, avoid obstacles, plan routes, calculate their velocity
and orientation, identify dangerous situations, etc. Such robots will be useful
in exploring dangerous or very distant environments (e.g. other planets,
inside nuclear reactors). Computer Vision can be used to help cameras follow
the trajectory of people and vehicles, for example for traffic monitoring;
it can help in the identification of faces for security clearance; it can
be used for converting 2D images into 3D models that can then be rotated
and manipulated, for example to present medical or sporting images from a
better angle; and it can be used to inspect medical images in order to
identify tumours and other ailments.
Over the next decade, it is anticipated that Computer Vision systems will
become commonplace, and that vision technology will be applied across a broad
range of business and consumer products. This implies that there will be strong
industry demand for computer vision engineers - for people who understand vision
technology and know how to apply it to real-world problems. This course will
cover the fundamentals of Computer Vision. It is suited mainly for students who
are interested in doing research in the area of Computer Vision. For graduate
students, there are many open problems in this area suitable for investigation
leading to a Master's thesis or a Ph.D. dissertation.
Course Outline (tentative)
- Introduction to Computer Vision (Trucco, Chapt 1)
- Image Processing Review
- Image Formation (Nalwa, Chapt 2)
- Camera Parameters (Trucco, Chapt 2)
- Camera Calibration (Trucco, Chapt 6)
- Stereo (Trucco, Chapt 7)
- Motion (Trucco, Chapt 8)
- Shape from shading and from texture (Trucco, Chapt 9)
- Recognition (Trucco, Chapt 10)
- Applications
Exams and Assignments
Exams: There will be a midterm and a final exam.
Homework: Homework problems will be assigned on a regular basis and will
be collected at the beginning of the class on the due date. Solution sets will
be provided for all problems assigned.
Programming Assignments: There will be about 4-5 programming assignments
which should be done on an individual basis. For each programming assignment,
you are to turn in a brief report which should include a description of the
problem, a description of your approach, and your evaluation of the results.
Details of the deliverables will be given with each assignment.
Discussion of the programming assignments is allowed and encouraged. However,
each student should do his/her own work. Assignments which are too similar will
receive a zero.
Paper presentation: Each student will be required to present a paper
to the rest of the class. The presentation should be professional, as if it
were given at a formal conference (i.e., using slides and a projector). More details
will be provided in class.
Course Prerequisites
The prerequisites for this course are CS308 (Data Structures) and CS474/674
(Image Processing and Interpretation); however, I
will waive the CS474/674 requirement depending on your background and
interests. Good programming skills and a solid mathematical background are essential.
C++ information
Software
- Xv
xv(1) is an interactive image display program for the X window system that is useful for displaying and editing images in a variety of formats.
- CVIPtools
GUI-based computer vision and image processing tools for Windows95/NT and UNIX. It includes ANSI-C source code and libraries for image analysis, image compression, image enhancement, image restoration, and many imaging utilities, as well as an extended Tcl shell with all the computer imaging functions. It has been installed on the Suns in the
/image directory. To run it, first add the following to
your .cshrc file:
setenv CVIPHOME /image/CVIPtools
setenv CVIP_IMGPATH ./
setenv CVIP_DISPLAY picture
setenv TCL_LIBRARY /image/CVIPtools/CVIPTCL/lib/tcl7.6
setenv TK_LIBRARY /image/CVIPtools/CVIPTCL/lib/tk4.2
setenv XF_LOAD_PATH /image/CVIPtools/CVIPTCL/GUI_SCRIPTS
set path=($path /image/CVIPtools /image/CVIPtools/CVIPTCL /image/CVIPtools/bin)
Then enter: source .cshrc to update your settings. You can then run CVIPtools by entering: CVIPtools
- Gimp
The Gimp is the GNU image manipulation program.
- Intel Computer Vision Library (OpenCV)
Image processing and computer vision algorithms optimized to run on Intel microprocessors.
- Matlab
Matlab(1) is a numeric computation and visualization environment. The image processing and signal processing toolboxes are especially useful. See also: Matlab Tutorial (Univ Utah), Matlab Basics (RPI), Matlab Primer (200K postscript; 25 pages).
- Microsoft Vision SDK Library
A library for writing image processing and computer vision programs on Microsoft Windows machines.
- Xforms
A GUI toolkit based on Xlib for the X Window System. Xforms has been installed on the Suns in /usr/local/src/xforms and on the Linux boxes in LME 314 (see
some more examples; an updated version of the manual is here).
How to capture an MPEG file and store the frames into separate files
How to make MPEG movies
More software .... (Good stuff !!)
Ghostview (software for viewing postscript files)
Gnuplot (a command-driven interactive function plotting program)
ResearchIndex (a scientific literature digital library - find papers easily!)
Syllabus
Sample Midterm Exam and Study Guide
Image-Processing-Related Material
Source Code
Reading Assignments
Lectures
Generalized Hough Transform
Deformable Contours (Snakes)
- Study the paper: Williams, Donna and Shah, Mubarak. "A Fast Algorithm for Active Contours and Curvature Estimation", CVGIP: Image Understanding. Vol. 55, No. 1, January 1992. pp. 14-26.
Edge Contour Representation
Region Based Segmentation
Thresholding
Region Growing
- Study the paper: Besl and Jain. "Segmentation Through Variable-Order Surface Fitting", IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 10, no. 2, pp. 167-192, 1988.
Region Splitting and Merging
Connected Components
Region Representation
Corner Detection
Read the review material on Linear Algebra from the Mathematical Methods for Computer Vision page.
2D Geometric Transformations
3D Geometric Transformations
Singular Value Decomposition
Image Formation
Perspective Projection
Geometric Camera Parameters
Camera Calibration
- Study the paper: Z. Zhang. "A flexible new technique for camera calibration" IEEE Transactions on Pattern Analysis and Machine Intelligence,
vol. 22, no. 11, pp. 1330-1334, 2000 (also, click here).
Stereo Camera
Stereo Correspondence Problem
- Study the paper: T. Kanade and M. Okutomi. "A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment", IEEE International Conference on Robotics and Automation, pp. 1088-1095, 1991.
Epipolar Geometry
Stereo Reconstruction
Homework
Programming Assignments
- The set of face images can be found here.
- The code to compute the 1D Gaussian mask can be found here.
- The code to solve overdetermined systems of equations using Singular Value Decomposition (SVD) can be found here.
- See an example of how to call a C function from C++ here
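For orientation, a 1D Gaussian mask can be sketched as follows. This is not the course-provided code: the function name gaussian_mask_1d, the 3-sigma truncation, and the normalization to unit sum are assumptions made for illustration.

```python
import math


def gaussian_mask_1d(sigma, half_width=None):
    """Return a normalized 1D Gaussian mask as a list of floats.

    Assumptions (not taken from the course code): the mask is truncated at
    ceil(3 * sigma) samples on each side unless half_width is given, and the
    weights are scaled so that they sum to 1.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if half_width is None:
        half_width = int(math.ceil(3.0 * sigma))
    # Sample the unnormalized Gaussian at integer offsets around zero.
    weights = [
        math.exp(-(x * x) / (2.0 * sigma * sigma))
        for x in range(-half_width, half_width + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


mask = gaussian_mask_1d(1.0)
print(len(mask))  # 7
```

Because the 2D Gaussian is separable, a mask like this can be applied along rows and then along columns instead of building a full 2D mask.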
Prog. Assignment 2 (due on: 3/23/04)
Prog. Assignment 3 (due on: 4/27/04)
- Click here to find the calibration data
- Information about the meaning of the files and OpenCV's calibration procedure can be found here
- svdcmp.c (there is a copy of the book "Numerical Recipes" in the lab)
- OpenCV F.A.Q. and manual
Prog. Assignment 4 (due on: 5/13/04)
Paper Presentations
Please email me your top 3 choices by Tuesday, May 29th (2:30pm). If I do not
hear from you by that time, I will assign a paper to you.
Oral Presentation Advice
REPRESENTATION
SEGMENTATION
- Chad Carson, Serge Belongie, Hayit Greenspan and Jitendra Malik,"Blobworld: Color- and Texture-Based Image Segmentation Using EM and Its Application to Image Querying and Classification", IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(8):1026-1038, August 2002. (assigned to Javier Martinez, May 13, 1:00-2:00PM)
- J. Shi and J. Malik, "Normalized Cuts and Image Segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), August, 2000, pp. 888-905.
- Mahamud, S.; Williams, L.R.; Thornber, K.K.; Kanglin Xu, "Segmentation of multiple salient closed contours from real images", IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(4), pp. 433-444, 2003.
- D. Jacobs, "Robust and Efficient Detection of Convex Groups", IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(1), pp. 23-37, 1996. (assigned to Junmei Wang, May 12, 2:00-3:00 PM)
CAMERA CALIBRATION
MOTION
- C. Tomasi and T. Kanade, "Shape and motion from image streams under orthography - a factorization method", Technical Report TR-92-1270, Cornell University, March 1992.
- A. Verri and T. Poggio, "Motion Field and Optical Flow: Qualitative Properties", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 5, pp. 490-498, 1989.
- Wang, J.Y.A.; Adelson, E.H., "Representing moving images with layers", IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 625-638, 1994. (assigned to Sohei Okamoto, May 13, 3-4PM)
TRACKING
- Jianbo Shi, Carlo Tomasi, "Good Features to Track", IEEE Conference on Computer Vision and Pattern Recognition (CVPR'94) (assigned to Yan Tong, May 13, 11:00-noon)
- Dieter Koller, Joseph Weber, Jitendra Malik, "Robust Multiple Car Tracking with Occlusion Reasoning", Technical Report UCB/CSD 93/780, Computer Science Division (EECS), University of California, Berkeley, 1993. (assigned to Kai She, May 13, 2004, 2:00-3:00pm)
APPLICATIONS
- Kanade, T.; Rander, P.; Narayanan, P.J.,"Virtualized reality: constructing virtual worlds from real scenes", IEEE Multimedia Magazine, Vol. 4, No. 1, pp. 34-47, 1997. (assigned to Beifang Yi, May 12, 11:00-noon)
- Richard Szeliski,"Video Mosaics for Virtual Environments", IEEE Computer Graphics and Applications, vol. 16, no. 2, pp. 22-30, 1996. (assigned to Mehmet Eser, May 12, 10:00-11:00AM)
- M. Irani and S. Peleg, "Improving Resolution by Image Registration", CVGIP: Graphical Models and Image Processing, vol. 53, no. 3, pp. 231-239, 1991.
Department of Computer Science, University of Nevada, Reno, NV 89557
Page created and maintained by:
Dr. George Bebis
(bebis@cs.unr.edu)