Tutorials (5 July 2005)

Morning programme

· Face Recognition: Past, Present & Future (Moghaddam)

· Design and Optimization of Adaptive Multimedia Systems (Akella, van der Schaar, Gupta)

· Multimedia Collaboration: Systems & Technologies (Rui)

 

Afternoon programme

· Human-Centered Multimedia Information Systems (Sebe, Jaimes)

· Multimedia Processing on Multiprocessor SoC Platforms (Marculescu, Chakraborty, Stravers)

 


 

Face Recognition: Past, Present & Future

 

Instructor:

Baback Moghaddam

Mitsubishi Electric Research Laboratories

Email: baback@merl.com

 

Abstract

This short course provides a historical overview of the field of automatic face recognition (dating back some 20+ years), followed by a relatively detailed account of the major research paradigms, their origins and significance, the variety of computational algorithms developed, and their current use in both research and industry. Being essentially a tutorial survey, the emphasis is on breadth of coverage rather than depth in any one topic. The ultimate goal is to understand what is so special about human faces, how hard face recognition by machines really is, what the best ways are to tackle the various challenges, what the current “state of the art” is, and why face recognition will become an increasingly common and vital part of our lives in the coming age of ubiquitous surveillance.

 

Outline

 

- feature-based
- appearance-based
- view-based
- 2D shape-texture models
- 3D shape-texture models
- Neural Networks
- PCA, LDA, Dual PCA (Bayesian); a minimal PCA/eigenfaces sketch follows this outline
- Fisherface, Multilinear Models (Tensorfaces)
- Nonlinear Kernel Methods
- 3D shape-texture models
- Support Vector Machines (SVM)
- Databases: MIT, Yale, PIE, AR, M2VTS, HID
- FERET (1990s)
- FRVT (2002-2005)
- FRGC (2005)
- Pose & Viewing Geometry
  o View-Based 2D Models
  o 3D Face Models
- Illumination variation
  o Lambertian Models (Quotient Image)
  o Illumination Cones, Spherical Harmonics, 9PL
  o Shadows, Skin Reflectance Models (BRDF)
- Non-rigid deformation (expression)
- Age, Ethnicity, Makeup, Disguise, etc.
- Biometrics & Security
- Computer-Human Interface
- Entertainment & Video Games
- Standards: MPEG-7 & MPEG-21
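
To make the appearance-based items in this outline concrete, here is a minimal PCA/eigenfaces sketch in Python. The data, sizes and parameter names are illustrative stand-ins (synthetic "faces"), not material from the tutorial.

import numpy as np

def eigenfaces(faces, n_components=20):
    """Return the mean face and the top principal components ("eigenfaces").

    faces: (n_images, height*width) array of vectorized, gray-scale face images.
    """
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(face, mean, components):
    """Low-dimensional PCA coefficients used for nearest-neighbour matching."""
    return components @ (face - mean)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    faces = rng.normal(size=(50, 32 * 32))   # stand-in for a small face database
    mean, comps = eigenfaces(faces, n_components=10)
    coeffs = project(faces[0], mean, comps)
    print(coeffs.shape)                      # (10,)

Recognition then amounts to comparing the coefficient vector of a probe face against those of enrolled faces, e.g. by Euclidean or Mahalanobis distance.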

 

Intended audience

The first part of the course provides an introductory overview (survey) of the field that is appropriate for the uninitiated and those who simply want a basic-level understanding of the challenges and methodologies. The remainder of the course, however, will assume graduate-level competence in areas such as basic computer vision, computer graphics, image processing, probability and statistics, pattern recognition and some machine learning.

 

Biography of the instructor

Baback Moghaddam is a Senior Research Scientist at Mitsubishi Electric Research Laboratories (MERL) in Cambridge MA USA, where he works primarily in the area of computational vision and Bayesian learning. His past research interests were in probabilistic visual modeling, object recognition, facial analysis, statistical learning theory and advanced pattern recognition techniques for biometrics. Prior to coming to MERL, he was at the Vision & Modeling Group at the MIT Media Laboratory where he developed the MIT automatic face recognition system that won the 1996 DARPA “FERET” face recognition competition. He has written numerous papers and book chapters on face recognition, including the core chapter in Springer-Verlag's recent “Handbook of Face Recognition.” Dr. Moghaddam is a senior member of the IEEE and ACM.

 


 

Design and Optimization of Adaptive Multimedia Systems

 

Instructors:

Prof. Venkatesh Akella

Department of Electrical & Computer Engineering

University of California, Davis

One Shields Avenue

Davis, CA 95616-5294

Tel: +1-530-752-9810

Email: akella@ucdavis.edu

URL: http://www.ece.ucdavis.edu/~akella/

 

Prof. Mihaela van der Schaar

Department of Electrical & Computer Engineering

University of California, Davis

One Shields Avenue

Davis, CA 95616-5294

Tel: +1-530-754-6281

Email: mvanderschaar@ece.ucdavis.edu

URL: http://www.ece.ucdavis.edu/~mihaela/

 

Prof. Rajesh Gupta

Professor and Qualcomm Endowed Chair

Department of Computer Science and Engineering

University of California, San Diego

AP&M 3111, 9500 Gilman Drive,

La Jolla, CA 92093-0114

Tel: +1 (858) 822-4391

Email: rgupta@ucsd.edu

URL: http://www.cse.ucsd.edu/~gupta/

 

Abstract

The “one size fits all” design and implementation philosophy used for desktops is not appropriate for deploying multimedia applications on embedded systems such as smart phones, PDAs and other consumer-oriented appliances, which have limited and widely varying resources (memory, battery life, processing capability, etc.). What is required is a design methodology that adapts the implementation to different resource constraints while maximizing a user-defined utility function.

 

We will present an overview of state-of-the-art approaches to multimedia system design on resource-constrained embedded systems, using "energy" as an example of a resource that needs to be optimized. We start with an overview of next-generation multimedia algorithms such as H.264 and MPEG-21, their requirements, and the challenges posed by emerging applications such as surveillance, video conferencing and streaming multimedia. This sets the stage for the central theme of the tutorial, namely systematic approaches to resource-constrained embedded system design based on the notion of complexity scalability, which drives dynamic (run-time) resource adaptation and joint optimization of the application and the implementation. We will discuss the construction of complexity-scalable multimedia compression algorithms and highlight the trade-offs one can make between resource utilization and a user-defined utility function. Next, we will provide the details of embedded system architecture, with emphasis on processor design and operating system support for emerging multimedia applications. We will conclude with open problems and directions for future work.
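
As a rough, self-contained illustration of the complexity-scalability idea sketched above (not code from the tutorial), the following Python snippet picks, at run time, the decoder operating point that maximizes a utility score under a per-frame energy budget; the operating points, energy figures and utility values are invented for the example.

from dataclasses import dataclass

@dataclass
class OperatingPoint:
    name: str
    energy_mj_per_frame: float   # estimated energy cost of this configuration
    utility: float               # e.g., a perceptual quality score in [0, 1]

POINTS = [
    OperatingPoint("full-search ME, CABAC", 12.0, 1.00),
    OperatingPoint("fast ME, CABAC",         7.5, 0.93),
    OperatingPoint("fast ME, CAVLC",         5.0, 0.88),
    OperatingPoint("skip deblocking",        3.5, 0.80),
]

def pick_point(points, energy_budget_mj):
    """Highest-utility point within the per-frame energy budget (else the cheapest one)."""
    feasible = [p for p in points if p.energy_mj_per_frame <= energy_budget_mj]
    if feasible:
        return max(feasible, key=lambda p: p.utility)
    return min(points, key=lambda p: p.energy_mj_per_frame)

print(pick_point(POINTS, energy_budget_mj=6.0).name)   # "fast ME, CAVLC"

In a real system the budget would itself be derived from battery state, the utility from a rate-distortion-complexity model, and the selection would be re-evaluated as conditions change.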

 

Motivation, Objectives, and Outline

State-of-the-art multimedia compression and streaming technology can now enable a variety of delay-sensitive applications, such as videoconferencing, emergency services, surveillance, telemedicine, remote teaching and training, augmented reality and distributed gaming. However, efficiently designing and implementing state-of-the-art multimedia applications on resource-constrained and heterogeneous embedded devices is very challenging due to the following reasons:

 

  1. The complexity associated with state-of-the-art multimedia encoding and decoding algorithms is high. By complexity we mean the resources required to execute a given computation. Resources include hardware resources such as memory, CPU time, functional units, and instruction and data memory bandwidths, as well as power dissipation (battery life). Recent compression algorithms such as H.264, the MPEG-21 Scalable Video Coding standard and wavelet video compression achieve breakthroughs in rate-distortion (R-D) performance at the expense of a tremendous increase in complexity compared to older coding schemes such as MPEG-1 or MPEG-2.
  2. The encoding and decoding complexity depend on the multimedia content characteristics and the compression factors (bit-rates), which may be derived from time-varying network/channel conditions. In our previous work we have shown that the critical component (i.e., the most complex function) of a multimedia compression system varies with the multimedia characteristics and the bit-rate at which the data is encoded. For instance, for the same CIF-resolution video sequence, entropy decoding takes only 10% of the total processing time at low rates (200 kbps), but almost 80% at higher rates (1000 kbps). Likewise, depending on the video characteristics and the chosen algorithm implementation, the fraction of time spent on motion estimation can vary from less than 40% to almost 90%.
  3. With the advent of Internet/web TV, on-line gaming, videoconferencing and other emerging applications, multiple complex multimedia processing/compression tasks need to be executed simultaneously and share the available resources of the embedded system.
  4. Many emerging multimedia applications require real-time encoding and decoding under very stringent delay constraints: delays of less than 200 milliseconds are required for interactive applications such as videoconferencing and surveillance, while for multimedia streaming applications delays of no more than 1-2 seconds are tolerable. Hence, multimedia data needs to be processed in a delay-sensitive manner.

 

These challenges are being addressed by researchers in a diverse range of communities such as system architecture, embedded system design, operating systems, multimedia compression and communications, real-time systems, video signal processing and multimedia applications. The goal of this tutorial is to bring the exciting new developments taking place in these disciplines under one umbrella to foster cross-pollination of ideas and catalyze synergistic inter-disciplinary research for next generation multimedia systems.

 

After taking this course, attendees should have a comprehensive understanding of the challenges in embedded system design for multimedia applications and the solutions on the horizon. In particular, systems-oriented attendees will be exposed to the opportunities that exist in compression algorithms and video signal processing to improve their designs; conversely, multimedia software and application-oriented attendees will gain an understanding of the advances in real-time systems, middleware and programmable hardware for efficient implementation of multimedia systems.

 


 

The outline of the tutorial is as follows:

 

  1. Overview of Multimedia Algorithms, Applications and their Requirements
  2. Rate, Distortion, Complexity Scalability - Theory and Practice
  3. Modeling and Abstracting Complexity
  4. Constructing Complexity Scalable Algorithms
  5. Embedded Systems Architecture - dynamic voltage, frequency and resource scaling, and opportunities for reconfigurable logic (see the DVFS sketch after this outline)
  6. Systematic Design Methodologies for Co-processors - formal models, hardware/software partitioning and co-design
  7. Operating Systems support for Multimedia Applications
  8. Cross Layer Strategies for Energy Minimization of typical multimedia applications on general purpose processors
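
As referenced in item 5, here is a hypothetical dynamic voltage/frequency scaling (DVFS) sketch: it picks the lowest discrete frequency level that still finishes a frame before its deadline. The frequency levels and cycle estimates are made up for illustration and are not from the tutorial.

FREQ_LEVELS_MHZ = [200, 400, 600, 800]

def pick_frequency(cycles_per_frame, deadline_ms):
    """Lowest frequency (MHz) meeting the deadline, else the maximum level."""
    for f in FREQ_LEVELS_MHZ:
        exec_time_ms = cycles_per_frame / (f * 1e3)   # f MHz = f*1e3 cycles per ms
        if exec_time_ms <= deadline_ms:
            return f
    return FREQ_LEVELS_MHZ[-1]

# A 33 ms frame period with an estimated 12 million cycles of decoding work:
print(pick_frequency(cycles_per_frame=12e6, deadline_ms=33))   # 400

Because dynamic power grows roughly with V^2 * f, running at the lowest feasible frequency (and the corresponding voltage) saves energy while still meeting the frame deadline.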

 

Target audience

The course is intended for professionals and researchers in multimedia communication systems and embedded system design with interest in emerging trends in video compression, computer architecture and system design.

 

Biographies of the instructors

Venkatesh Akella (http://www.ece.ucdavis.edu/~akella) received his PhD from the Department of Computer Science at the University of Utah and a Master's degree in Electrical and Communication Engineering from the Indian Institute of Science, Bangalore. He is an Associate Professor of Electrical & Computer Engineering at the University of California, Davis. He has over 12 years of experience and has published over 50 refereed papers on various aspects of computer systems architecture, electronic design automation, low-power design, system-level design methodologies, reconfigurable computing, asynchronous design, design verification and, more recently, resource management and optimization in embedded systems. He was visiting faculty in Hewlett-Packard's networking division, where he advised on the verification of advanced ASICs used in fibre-channel data storage products, and a consultant to Silicon Automation Systems, where he developed a design methodology for implementing multimedia applications on Sharp's programmable video processor. He was a key member of a Silicon Valley startup that developed a novel programmable platform for high-performance, low-power wireless and multimedia applications, serving as chief architect of both the hardware and the software design tools. In 1999, he was a Visiting Professor at the Indian Institute of Management, Bangalore (IIM Bangalore), where he conducted research on software management and financial models for quantifying the benefits of design reuse. Prof. Akella received the National Science Foundation CAREER Award and is currently a principal investigator or co-principal investigator on five NSF grants in the areas of computer architecture, optical networking, error-correction codes, software system architecture for sensor networks, and programmable edge routers.

 

Mihaela van der Schaar (http://www.ece.ucdavis.edu/~mihaela) received her PhD degree in electrical engineering from Eindhoven University of Technology, the Netherlands. She is currently an Assistant Professor in the Electrical and Computer Engineering Department at the University of California, Davis. Between 1996 and June 2003, she was a senior member of the research staff at Philips Research in the Netherlands and the USA. At Philips, she led the research activity on adaptive video coding and streaming over the Internet and wireless networks, and was also involved in research on low-cost, very high quality video compression techniques and their implementation for TV, computer and camera systems. Since 1999, she has been an active participant in MPEG-4 standardization, contributing to the scalable video coding activities, and she was a co-editor of the MPEG-4 “Fine Granularity Scalability” standard. She is currently the chair of the MPEG ad-hoc group on Scalable Video Coding and co-chair of the Multimedia Streaming Test-bed group. She has given numerous tutorials on scalable video coding, multimedia networking and architectures at IEEE conferences and for the Philips Center of Technical Training. In 2003, she was also an Adjunct Professor at Columbia University. She has chaired and organized numerous special sessions on multimedia compression, streaming and architectures, was a guest editor of the EURASIP special issue on multimedia over IP and wireless networks (March 2004), and was the General Chair of the Picture Coding Symposium 2004. Her research interests are in multimedia networking, compression and architectures. She has co-authored more than 100 book chapters and papers and holds 15 patents. She was elected a member of the Technical Committee on Multimedia Signal Processing of the IEEE Signal Processing Society and is an associate editor of IEEE Transactions on Multimedia, IEEE Transactions on Circuits and Systems for Video Technology and the SPIE journal Optical Engineering. She is a Senior Member of the IEEE. Prof. van der Schaar received the National Science Foundation CAREER Award.

 

Rajesh Gupta (http://www.cs.ucsd.edu/~gupta/) is a professor and holder of the Qualcomm endowed chair in embedded microsystems in the Department of Computer Science & Engineering at UC San Diego, California. He received his BTech in Electrical Engineering from IIT Kanpur, India, his MS in EECS from UC Berkeley, and his Ph.D. in Electrical Engineering from Stanford University. His current research interests are in embedded systems, VLSI design and adaptive system architectures. Earlier, he was on the faculty of the Computer Science departments at UC Irvine and the University of Illinois, Urbana-Champaign. Prior to that, he worked as a circuit designer on a number of processor design teams at Intel Corporation in Santa Clara, California. He is author or co-author of over 150 articles on various aspects of embedded systems and design automation, and holds three patents on PLL design, data-path synthesis and system-on-chip modeling. Gupta is a recipient of a Chancellor's Fellowship at UC Irvine, the UCI Chancellor's Award for excellence in undergraduate research, the National Science Foundation CAREER Award, two Departmental Achievement Awards and a Components Research Team Award at Intel. Gupta is editor-in-chief of IEEE Design & Test of Computers and serves on the editorial boards of IEEE Transactions on CAD and IEEE Transactions on Mobile Computing. Gupta is a Fellow of the IEEE and a distinguished lecturer for ACM/SIGDA and the IEEE CAS Society.

 

Please visit the instructors' websites for more details on their experience and expertise.

 


 

Multimedia Collaboration: Systems & Technologies

 

Instructor:

Dr. Yong Rui

Microsoft Research, Redmond, USA

 

Abstract

Multimedia communication and collaboration has become one of the most active research areas in the past few years. It enables students to attend classes remotely, and it can greatly increase information workers’ productivity. In this tutorial, we will cover both example collaboration systems, to motivate the work, and the underlying technologies, to drill into the fundamental research problems. Specifically, for systems we will cover (a) an automated lecture capture system, (b) RingCam, a 360-degree meeting recording system, and (c) a real-time room conferencing system with live whiteboard capture. For technologies, we will cover microphone-array sound source localization, real-time person tracking, and probabilistic sensor fusion for speaker tracking using particle filters.

 

Outline

 

1.      (First hour): Scenarios and Systems

1.1.   Overview: The importance and context for multimedia collaboration.

1.2.   Four scenarios and three systems:

1.2.1.     An automated lecture room

This is a system that records and broadcasts lectures and presentations fully automatically, using state-of-the-art computer-vision-based person tracking (for the presenter), microphone-array sound source localization (for the audience), and virtual director rules. Its quality approaches that of a human camera operator.

1.2.2.     RingCam 360-degree meeting recording system (with demo)

RingCam is a device that, when placed on the meeting room table, captures a 360-degree field of view of the meeting room. It also has multiple microphones built into its base. It supports a rich off-line meeting viewing experience by providing an active-speaker view, an event-based timeline of the meeting, and time compression for audio speed-up.

1.2.3.     Real-time room conferencing system and live whiteboard capture (with video)

When people attend meetings remotely, they may not hear clearly, may not see the person they want to see, and tend to be ignored by the people in the local meeting room. This system gives remote participants a sense of “being there” through a remote-person stand-in device. In addition, the live whiteboard capture capability brings physical whiteboards into the digital world using computer-vision classification techniques. With this functionality, one can write on a regular physical whiteboard while other meeting participants see a digital whiteboard without the person in front of it.

 

2.   (Second hour): Audio Technologies

2.1.   Overview of audio capture (e.g., beamforming) and audio sound source localization (SSL).

2.2.   Single-pair SSL

We discuss and show a new SSL weighting function, based on the generalized maximum likelihood principle that simultaneously handles ambient noise and room reverberation.

2.3.   Multi-pair SSL – direct methods for robust SSL estimation

Conventional SSL methods use two steps: the first estimates each pair’s bearing angle using time-delay algorithms, and the second intersects multiple bearing angles to form the final angle estimate. Because this two-step process throws away important information, its performance is suboptimal. We will show several direct methods that solve the SSL problem in one step, significantly improving estimation accuracy and robustness.
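
For readers unfamiliar with the single-pair step above, the following Python sketch shows one common time-delay approach: estimate the time-delay-of-arrival (TDOA) between two microphones from the cross-correlation peak and convert it to a far-field bearing angle. The sampling rate, spacing and signals are synthetic, and a practical system would apply a frequency-domain GCC weighting (such as PHAT, or the maximum-likelihood weighting discussed in the tutorial) rather than plain cross-correlation.

import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
MIC_SPACING = 0.20       # m (hypothetical pair spacing)
FS = 16000               # Hz

def tdoa_by_xcorr(x1, x2):
    """Delay (seconds) of x2 relative to x1, from the cross-correlation peak."""
    corr = np.correlate(x2, x1, mode="full")
    lag = np.argmax(corr) - (len(x1) - 1)
    return lag / FS

def bearing_deg(tau):
    """Far-field bearing from TDOA: theta = arcsin(c * tau / d)."""
    s = np.clip(SPEED_OF_SOUND * tau / MIC_SPACING, -1.0, 1.0)
    return np.degrees(np.arcsin(s))

# Synthetic test: the same burst arrives 3 samples later at microphone 2.
rng = np.random.default_rng(1)
burst = rng.normal(size=1024)
x1 = np.concatenate([burst, np.zeros(8)])
x2 = np.concatenate([np.zeros(3), burst, np.zeros(5)])
print(round(bearing_deg(tdoa_by_xcorr(x1, x2)), 1))   # about 19 degrees here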

 

3.   (Third hour): Video and Sensor Fusion Technologies

3.1.   Computer vision based person tracking

3.1.1.     Hidden Markov model (HMM) contour tracking (intra-frame)

Observations of the HMM are collected along lines normal to the object contour. The states are candidate positions along each normal line, state transitions model contour smoothness constraints, and the optimal contour is obtained by Viterbi decoding.
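
A generic dynamic-programming sketch of this formulation is given below: each normal line contributes an observation score per candidate position, a quadratic penalty between neighbouring lines encodes the smoothness constraint, and Viterbi decoding recovers the best contour. The scores are random placeholders rather than the tutorial's actual observation model.

import numpy as np

def viterbi_contour(obs_scores, smoothness=1.0):
    """obs_scores: (n_lines, n_positions) log-likelihoods. Returns the best path."""
    n_lines, n_pos = obs_scores.shape
    positions = np.arange(n_pos)
    # Transition log-penalty: discourage large jumps between neighbouring lines.
    trans = -smoothness * (positions[:, None] - positions[None, :]) ** 2
    score = obs_scores[0].copy()
    backptr = np.zeros((n_lines, n_pos), dtype=int)
    for t in range(1, n_lines):
        cand = score[:, None] + trans                 # (from_position, to_position)
        backptr[t] = np.argmax(cand, axis=0)
        score = cand[backptr[t], positions] + obs_scores[t]
    path = [int(np.argmax(score))]
    for t in range(n_lines - 1, 0, -1):
        path.append(int(backptr[t][path[-1]]))
    return path[::-1]                                 # one position per normal line

demo = np.random.default_rng(2).normal(size=(8, 5))   # 8 normal lines, 5 positions each
print(viterbi_contour(demo))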

3.1.2.     Unscented Kalman Filter (UKF) tracking (inter-frame)

The Kalman filter (KF) provides the optimal solution for linear, Gaussian dynamic systems. When a system is non-linear, as is usually the case in practice, the extended Kalman filter (EKF) is normally used to linearize it. A better alternative, the unscented Kalman filter (UKF), is based on the elegant unscented transformation: it captures higher-order terms of the Taylor expansion and thus achieves more accurate tracking results.
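
The following is a minimal sketch of the classic (Julier-style) unscented transform at the heart of the UKF: a small, deterministically chosen set of sigma points is pushed through the nonlinear function, and the transformed mean and covariance are recovered from weighted sample statistics. The example function and parameters are illustrative assumptions only.

import numpy as np

def unscented_transform(mean, cov, f, kappa=1.0):
    """Propagate N(mean, cov) through f using 2n+1 sigma points."""
    n = mean.size
    sqrt_cov = np.linalg.cholesky((n + kappa) * cov)
    # Sigma points: the mean plus/minus the columns of the scaled covariance square root.
    sigma = np.vstack([mean, mean + sqrt_cov.T, mean - sqrt_cov.T])
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    y = np.array([f(s) for s in sigma])
    y_mean = w @ y
    diff = y - y_mean
    y_cov = (w[:, None] * diff).T @ diff
    return y_mean, y_cov

# Push a 2-D Gaussian through a mildly nonlinear map.
m, P = np.array([1.0, 0.5]), np.diag([0.1, 0.2])
print(unscented_transform(m, P, lambda x: np.array([np.sin(x[0]), x[0] * x[1]])))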

3.1.3.     On-line adaptation

The HMM handles intra-frame contour tracking, and the UKF handles inter-frame tracking. Another important component is on-line adaptation, since the object and/or environment change appearance constantly. This is achieved by using the soft decisions obtained from the Viterbi decoding process in 3.1.1.
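
A generic sketch of such on-line adaptation is shown below: the appearance model is blended with a new, confidence-weighted observation using a forgetting factor, where the confidence would come from the soft Viterbi decisions mentioned above. The histogram representation and constants are illustrative assumptions, not the tutorial's algorithm.

import numpy as np

def adapt_model(model_hist, frame_hist, confidence, forgetting=0.9):
    """model_hist, frame_hist: normalized histograms; confidence in [0, 1]."""
    rate = (1.0 - forgetting) * confidence      # adapt faster when the tracker is confident
    updated = (1.0 - rate) * model_hist + rate * frame_hist
    return updated / updated.sum()

model = np.ones(16) / 16                        # flat initial colour model
obs = np.random.default_rng(3).random(16)
obs /= obs.sum()
print(adapt_model(model, obs, confidence=0.8))  # still a normalized histogram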

3.2.   Sensor fusion

Here we talk about how to fuse the tracking results from both SSL and person tracking to achieve more robust speaker detection.

3.2.1.     Basics of particle filters

In 3.1.2 we saw that the KF is optimal for linear, Gaussian systems, and that the EKF and UKF can deal with non-linear (but Gaussian) systems. In real life, however, many systems are both non-linear and non-Gaussian. Particle filters are an effective tool for such systems: they use weighted samples (particles) to estimate the posterior probability.
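
The bare-bones bootstrap particle filter below (a 1-D toy example; the random-walk motion model and Gaussian likelihood are stand-ins, not the tutorial's speaker-tracking models) shows the predict / weight / resample cycle that underlies particle-filter sensor fusion.

import numpy as np

rng = np.random.default_rng(4)

def particle_filter_step(particles, weights, observation, motion_std=0.5, obs_std=1.0):
    # 1) Predict: diffuse particles with a simple random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # 2) Update: reweight each particle by the observation likelihood.
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights /= weights.sum()
    # 3) Resample (multinomial) to avoid weight degeneracy.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

particles = rng.uniform(-5, 5, size=200)
weights = np.full(200, 1.0 / 200)
for obs in [0.2, 0.5, 0.9]:                     # synthetic observations
    particles, weights = particle_filter_step(particles, weights, obs)
print(round(float(np.mean(particles)), 2))      # posterior mean estimate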

3.2.2.     Speaker tracking using particle filter sensor fusion

We discuss how to design the proposal function for the particle filter and how to estimate the weights for each individual sensor.

 

Intended audience

Researchers, students and practitioners in the fields of multimedia collaboration, audio processing and computer vision, and anyone who wants to learn about cool systems and technologies in these areas.

 

Biography of the instructor

Yong Rui is a Researcher and the manager of the Multimedia Collaboration team at Microsoft Research, Redmond. Dr. Rui is a Senior Member of the IEEE and a Member of the ACM. He is an Editor of the ACM/Springer Multimedia Systems Journal, an Associate Editor of IEEE Transactions on Multimedia, and a member of the editorial board of the International Journal of Multimedia Tools and Applications. He received his PhD from the University of Illinois at Urbana-Champaign (UIUC).

 

Dr. Rui’s research interests include computer vision, signal processing, machine learning, and their applications in communication, collaboration, and multimedia systems. He has published one book (Exploration of Visual Data, Kluwer Academic Publishers), six book chapters, and over sixty refereed journal and conference papers in these areas. Dr. Rui has served on the organizing and program committees of ACM Multimedia, IEEE CVPR, ECCV, ACCV, IEEE ICIP, IEEE ICASSP, IEEE ICME, SPIE ITCom, ICPR and CIVR, among others. He is a Program Chair of the International Conference on Image and Video Retrieval (CIVR) 2006, was a Program Area Chair of ICME 2002 and ICME 2005, and was Program Co-Chair of the IEEE International Workshop on Multimedia Technologies in E-Learning and Collaboration (WOMTEC) 2003. He has served on NSF review panels and participated in the National Academy of Engineering's Symposium on Frontiers of Engineering for outstanding researchers.

 

Dr. Rui gives many public talks at conferences, trade shows, and internal training sessions. His tutorial on “Multimedia Collaboration” at the Pacific-Rim Conference on Multimedia (PCM) 2004 was one of the highest-rated tutorials.

 

Publication list: http://www.research.microsoft.com/~yongrui/html/publication.html

 

Professional activity list: http://www.research.microsoft.com/~yongrui/html/activity.html

 


 

Human-Centered Multimedia Information Systems

 

Instructors:

Dr. Nicu Sebe

Faculty of Science,

University of Amsterdam

Netherlands

Email: nicu@science.uva.nl

URL: http://www.science.uva.nl/~nicu

 

Dr. Alejandro (Alex) Jaimes

FXPAL Japan, Corporate Research Group,

Fuji Xerox Co., Ltd.

Japan

Email: alex.jaimes@fujixerox.co.jp

 

Abstract

This tutorial will take a holistic view of the research issues and applications of Human-Centered Multimedia Information Systems, focusing on three main areas: (1) multimedia data: conceptual analysis at different levels, e.g., feature, cognitive, and affective; (2) indexing algorithms: context modeling, cultural issues, and machine learning for user-centric approaches; and (3) multimodal interaction: vision (body, gaze, gesture) and audio (emotion) analysis.

 

Motivation, Objectives, and Outline

Multimedia lies at the crossroads of many research areas (psychology, artificial intelligence, HCI, etc.) and is used in a wide range of applications. In particular, there are many applications in which humans directly interact with multimedia data (Human-Centered-Multimedia Information Systems).

 

On one hand, the fact that computers are quickly becoming integrated into everyday objects (ubiquitous and pervasive computing) implies that effective natural human-computer interaction is becoming critical (in many applications, users need to be able to interact naturally with computers the way face-to-face human-human interaction takes place). On the other hand, the wide range of applications that use multimedia, and the amount of multimedia content currently available, imply that building successful multimedia applications requires a deep understanding of multimedia content.

 

The success of human-centered-multimedia information systems, therefore, depends highly on two joint aspects: (1) the human factors that pertain to multimedia data (human subjectivity, levels of interpretation), and (2) the way humans interact naturally with such systems (using speech and body language) to express emotion, mood, attitude, and attention.

 

In this tutorial, we take a holistic approach to the human-centered-multimedia information systems problem. We aim to identify the important research issues, and to ascertain potentially fruitful future research directions in relation to the two aspects above. In particular, we introduce key concepts, discuss technical approaches and open issues in three areas: (1) multimedia data: conceptual analysis at different levels, e.g., feature, cognitive, and affective; (2) indexing algorithms: machine learning for user-centric approaches; and (3) interaction: multimodal interaction in multimedia systems.

 

The focus of the tutorial, therefore, is not on multimedia content or technologies, but rather on technical approaches formulated from the perspective of key human factors in a user-centered approach to developing Human-Centered-Multimedia Information Systems.

 

This tutorial will enable participants to understand key concepts, state-of-the-art techniques, and open issues in each of the three areas outlined above.

 

 

Intended audience

The tutorial is intended for PhD students, scientists, engineers, application developers, computer vision specialists and others interested in the areas of multimedia information retrieval and human-computer interaction. A basic understanding of image processing and machine learning is a prerequisite.

 

Biographies of the instructors

Nicu Sebe is an assistant professor in the Faculty of Science, University of Amsterdam, The Netherlands, where he does research on multimedia information retrieval and human-computer interaction in computer vision applications. He is the author of the book Robust Computer Vision: Theory and Applications (Kluwer, April 2003) and of the upcoming book Machine Learning in Computer Vision (Springer, Spring 2005). He was a guest editor of a CVIU special issue on video retrieval and summarization (December 2003) and was co-chair of the ACM Multimedia Information Retrieval workshops MIR '03 and MIR '04 (held in conjunction with the ACM Multimedia conferences). He was also co-chair of the first Human-Computer Interaction workshop, HCI '04 (in conjunction with ECCV 2004), and is co-chair of the upcoming IEEE Workshop on Human-Computer Interaction (in conjunction with ICCV 2005). He is a guest editor of two special issues on multimedia information retrieval and human-computer interaction, in the ACM Multimedia Systems journal and the Image and Vision Computing journal. He was the technical program chair of the International Conference on Image and Video Retrieval, CIVR 2003. He was a visiting researcher at the Beckman Institute, University of Illinois at Urbana-Champaign (2002), and a research fellow at British Telecom in Ipswich (2003). He has published more than 50 technical papers on computer vision, content-based retrieval, pattern recognition and human-computer interaction, and has served on the program committees of several conferences in these areas. He is a member of the IEEE and the ACM.

 

Alejandro Jaimes is an Advanced Multimedia Specialist at Fuji Xerox's FXPAL Japan (in Nakai), where he leads the efforts in Multimedia Analysis and Interaction. His current research focuses on using computer vision techniques for multimedia analysis and interaction. Dr. Jaimes received a Ph.D. in Electrical Engineering (2003) and an M.S. in Computer Science (1997) from Columbia University in New York City. He holds a Computing Systems Engineering degree from Universidad de los Andes (1994) in Bogota, Colombia. Prior to joining the Ph.D. program at Columbia, he was a member of Columbia's Robotics and Computer Graphics groups, where he worked on projects related to computer vision and computer graphics. He has held summer research positions at AT&T Bell Laboratories, Siemens Corporate Research, and IBM (T.J. Watson and Tokyo Research Laboratories). His recent professional activities include co-chairing the ACM Multimedia 2005 and 2004 Interactive Art programs, the PCM 2004 special session on “Immersive Conferencing: Novel Interfaces and Paradigms for Remote Collaboration” and the ICME 2004 special session on “Novel Techniques for Browsing in Large Multimedia Collections.” He is also co-founder and co-chair of the Workshop on Technology for Education in Developing Countries (TEDC ’05, TEDC ’04, TEDC ’03), and serves as a TPC member for several international conferences (ICME, ICIP, CIVR, and the ICCV and CVPR workshops on HCI, among others). His work has led to over 30 technical publications in international conferences and journals, and to numerous contributions to the MPEG-7 standard. He has 5 patents pending. He is a member of the IEEE and ACM.

 


 

Multimedia Processing on Multiprocessor SoC Platforms: What should Multimedia System Developers know about Architectural Design, Performance Analysis and Platform Management?

 

Instructors:

Radu Marculescu

Carnegie Mellon University

 

Samarjit Chakraborty

National University of Singapore

 

Paul Stravers

Philips Research, The Netherlands

 

Abstract

Multimedia applications represent the predominant workload in many embedded devices, from set-top boxes to mobile phones and PDAs. Most of these devices are now designed around generic platform architectures rather than from scratch. Examples of such multimedia-centric platforms include Eclipse and CAKE from Philips and OMAP from Texas Instruments. These platforms are based on heterogeneous multiprocessor architectures that are challenging to design, prototype, and implement. As a result, significant research effort has recently been directed towards (i) platform analysis and optimization techniques for multimedia applications and (ii) application development for System-on-Chip (SoC) multimedia implementations. This tutorial will provide a comprehensive overview of these developments and give the audience a “walk-through” of this emerging research area, with emphasis on real problems and pragmatic, easy-to-implement solutions for engineers and embedded software developers specializing in multimedia applications.

 

Outline

Considerable interest has recently emerged in generic, configurable System-on-Chip (SoC) platforms targeted at multimedia applications. Such platforms are designed for a particular application domain and are flexible enough to be (re)configured for several products within that domain. Examples are the Eclipse and CAKE SoC architectures from Philips, which target consumer-electronics media processing. Designs based on such generic platforms offer flexibility, low cost, and time-to-market advantages. Unfortunately, however, a large performance disparity remains between generic platform-based designs and fully customized solutions based on traditional ASIC implementations. Consequently, significant research effort is currently directed towards platform design, configuration, and management techniques that narrow this gap.

 

In this tutorial, we address these issues one by one. First, we review some real platforms targeted at multimedia applications. Second, we discuss modeling techniques relevant to multimedia processing on multiprocessor architectures. Last, we show how these models can be used to quantitatively analyze different architectures, and how compositional techniques can be used to reason about timing, quality of service, memory requirements, communication infrastructure, and media-quality trade-offs.
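
One widely used compositional style of analysis, mentioned here only as an illustration, is real-time/network calculus: delay and backlog bounds follow from an upper arrival curve and a lower service curve as their maximum horizontal and vertical deviations. The Python sketch below evaluates these bounds numerically for toy, made-up curves sampled at 1 ms resolution; it illustrates the general technique, not the specific models presented in the tutorial.

import numpy as np

def backlog_bound(arrival, service):
    """Maximum vertical deviation between the cumulative curves (buffer demand)."""
    return float(np.max(arrival - service))

def delay_bound(arrival, service):
    """Maximum horizontal deviation: worst-case wait until service catches up."""
    worst = 0
    for t, a in enumerate(arrival):
        later = np.nonzero(service[t:] >= a)[0]
        worst = max(worst, int(later[0]) if later.size else len(service) - t)
    return worst   # in curve-resolution units (here: ms)

t = np.arange(100)                        # 100 ms horizon, 1 ms steps
arrival = np.minimum(5 * t, 20 + 2 * t)   # bursty stream: fast start, then 2 units/ms
service = np.maximum(0, 3 * (t - 10))     # resource serves 3 units/ms after a 10 ms gap
print(backlog_bound(arrival, service), delay_bound(arrival, service))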

 

The proposed material spans basic techniques to advanced research issues. The state-of-the-art techniques used in industry today to address the design problems outlined above rely heavily on simulation. While useful, such techniques have several drawbacks, most notably prohibitively long simulation times. This tutorial will present practical, but non-trivial, solutions to overcome these drawbacks.

 

Intended audience

This tutorial is intended for multimedia application developers, as well as researchers and students interested in an overview of recent developments in multimedia processing on generic platform architectures. The emphasis is on performance analysis, power analysis and platform-management techniques. The presentation is intended for those with a multimedia background, with or without additional background in basic VLSI and design automation techniques. The goal is to bring together multimedia systems designers and application developers and offer them a perspective on recent developments and the critical issues that will shape next-generation multimedia processing platforms.

 

Biographies of the instructors

Radu Marculescu is an Associate Professor in the Dept. of Electrical & Computer Engineering at Carnegie Mellon University. His current research activities focus on system-level design methodologies, multimedia and ambient intelligent systems. He received two best paper awards at the 2001 and 2003 editions of the Design Automation & Test in Europe (DATE) Conference and one best paper award at the Asia & South Pacific Design Automation Conference in 2003. Dr. Marculescu is co-founder of the workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia) and will serve as workshop General Chair in 2005. Dr. Marculescu was the Guest Co-Editor of a Special Issue on “Designing Real-Time Embedded Multimedia Systems,” published by IEEE Design & Test of Computers in Sept./Oct. 2004.

 

Samarjit Chakraborty is an Assistant Professor in the Dept. of Computer Science at the National University of Singapore. He obtained his Ph.D. from ETH Zurich in 2003. For his Ph.D. thesis, he received the ETH Medal and the European Design and Automation Association’s (EDAA) “Outstanding Doctoral Dissertation Award” in 2004. Dr. Chakraborty’s research interests are primarily in the area of system-level design of real-time and embedded systems, with a focus on architectures for multimedia applications. He has recently served on the technical program committee of the IEEE Real-Time Systems Symposium (RTSS 2004) and is currently on the program committees of a number of conferences, including the Euromicro Conference on Real-Time Systems (ECRTS 2005) and the IEEE Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2005).

 

Paul Stravers has been a computer architect with Philips since 1995. He graduated cum laude from Delft University of Technology in 1989 and earned a Ph.D. in electrical engineering from the same university in 1994. He spent several years in Silicon Valley designing embedded MIPS processors, but he is best known for his role as the architect of Philips' next-generation systems-on-chip: embedded homogeneous multiprocessors for high-powered, software-centered media applications.