Last modified: Mon Sep 8 1997
Multimedia server taxonomy and evaluation
by Tobias Öbrink , Department of Teleinformatics, KTH
A presentation of activities in WP7 for MERCI meeting in INRIA Sophia
Antipolis April 1997.
Digital audio, video and computer supported cooperation tools are HYPE. VoD and Multimedia servers are sprouting up everywhere like weeds. There are several presentation languages and distributed programming languages waiting for streaming material to display in flashy shows on ITV, networked kiosks, home computers, you name it. Every platform vendor has presented its own high-performance solution, and a lot of heavyweight players, for example Microsoft, Netscape and Progressive Networks, have launched their own products. The IETF and ITU-T work hard to keep up with the current avalanche-like development and try to cooperate with each other as well as with the commercial vendors to produce working standards.
During the last year I have been collecting information about research projects and commercial development efforts as well as the standardisation efforts related to networked digital multimedia in general, and MoD in particular. I have a lot of information, but no common structure in which to present it. When I look at all these announcements of new products and all new research papers on Multimedia Server systems, I wish I had some classification scheme to use in comparing the different solutions. Some general taxonomy that is not connected to a specific product or research prototype.
We will develop a taxonomy of Multimedia Server-related elements to help in comparing different Multimedia Server solutions with regard to
- Relevant Standards
- Functionality
- System Design
- Performance
In parallel with this effort, I will continue to collect information about, and test, Multimedia servers to produce a survey of existing solutions using the above taxonomy. It will be an iterative process where the two parts complement each other.
The survey will of course become outdated relatively soon, but my hope is that the taxonomy will have a more lasting value.
A preliminary draft of a taxonomy can be found in Appendix A. Here's a short summary:
- Relevant Standards, de facto and formal, in the following areas
- Coding and Compression
- Session Control and Signalling
- Transport
- User Interface
- Interoperability support
- Security
- Functionality
- Main field of application
- Extensibility
- Playback
- Recording
- Editing
- Admission control
- System design
- Client
- Server
- Communication
- Storage strategy
- Hardware solution
- Performance
- Network related
- Media related
- Hardware
- User friendliness
A lot of information, so far in a rather unstructured state and with many loose ends, can be found in Appendix B. Here's a listing of the Multimedia servers included so far:
- SGI WebFORCE and MediaBase
- XMovie project, University of Mannheim
- Voyager project, Argonne National Laboratory, Illinois
- MBone-VCR on demand Service, University of Mannheim
- Oracle Video Client/Server
- Sun MediaCenter
- The RealMedia Model, Progressive Networks
- StreamWorks, Xing Technologies
- VXtreme client/server
- The multicast Media-On-Demand/mMOD system, Luleå Tekniska Universitet
- VivoActive from Vivo Software
- IP/TV from Precept Software
- NetShow 2.0 from Microsoft
- VDOLive from VDONet Corporation
- Vosaic
- Video Conference Recorder (VCR) from UCL
A digital encoding of some information, either digitized or computer-generated. The Media Types are
- Text
- Graphics
- Audio
- Video
A "stream" of packets containing data encoding of one media type.
A collection of streams originating from and operated by a "participant" .
The host from which a specific Stream originates. There can be several Participants sending from the same Source.
One or more Participants participating in a conference, lecture, call
Some Multimedia servers have limitations as to the encoding format of the Streams they play.
RTSP - Replay session control and signalling. Either interleaved with the playback stream or separate.
SIP - Either for recording or Indirect playback
- SAP - Multicast address allocation
- SIP - Inviting Participants
- RTP - Multicast address/port pair collision recovery
Playback Streams by IP multicast and unicast, both reliable and unreliable: RTP/RTCP, UDP, TCP, SRM, nt, etc.
Session Description by HTTP, FTP, SIP, SAP or some other way.
Many Multimedia servers use Tcl/Tk-based GUI, others use WWW
IP, RTP, H.32x - Standards for ensuring interoperation over different networks, platforms, codecs
Insecure network, end-points, proxies and mirrors.
Lecture-on-demand, news-on-demand, movie-on-demand, conference recorder, network-based multimedia presentation.
APIs and modules for extending the server's Functionality, Interoperability, and Security.
Direct playback - A new session is created by the Multimedia-server for playback.
Indirect playback - The Multimedia-server joins an already existing conference.
Search and Browse recorded Sessions, Participants, Streams.
Browse active session playbacks.
Start, kill, set speed, direction, jump to bookmark.
Visual Fast forward and Rewind by creating new indexes
Floor control for user interaction.
Explicitly invite a list of participants
Set a minimal playback session scope (TTL)
Replicate original sender information in the playback session.
- Reception time stamps
- Sender time stamps
- Burst transmission
- Indexed playback with "artificial" timestamps created at editing time
Browse active sessions, participants, streams and "point-and-click" those you want to record (using RTCP or SDP, SAP, SIP, IP header).
Delayed recording. Instruct the Multimedia server to start recording at a certain time.
Use RTP sequence numbers to reconstruct incoming streams by
- Reordering out-of-order packets
- Detecting duplicates
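The reconstruction step above can be sketched in a few lines of Python. This is only a minimal sketch and is not taken from any of the servers surveyed here: the function name `reconstruct` and the parameter `first_seq` are hypothetical, and it assumes that the whole recording is available as (sequence number, payload) pairs and spans fewer than 65536 packets, so that the 16-bit RTP sequence number wraps at most once.

```python
from typing import Iterable, List, Tuple

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide and wrap around


def reconstruct(packets: Iterable[Tuple[int, bytes]],
                first_seq: int) -> List[Tuple[int, bytes]]:
    """Return packets ordered by sequence number with duplicates removed.

    `packets` yields (sequence_number, payload) pairs in arrival order.
    `first_seq` is the sequence number the recording is assumed to start at.
    Distances are measured from it modulo 2**16, so a wrap from 65535 to 0
    still sorts correctly.
    """
    seen = {}
    for seq, payload in packets:
        offset = (seq - first_seq) % SEQ_MOD
        # Keep only the first copy of each sequence number.
        seen.setdefault(offset, (seq, payload))
    return [seen[offset] for offset in sorted(seen)]
```

A real recorder would more likely work on a sliding window of recent packets instead of the whole recording, which also removes the limit on the recording length.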
Edit recorded material.
Filter/merge and other post-processing of Sessions, Participants, Streams and sub-packet level (block, sample, object).
Orchestration (creating "artificial" time-stamps) uses the "atomic" time steps of the level 0 index, or creates new Streams based on processing on sub-packet level.
Analysis tools to help editing and statistics.
Editing quality and other aspects of media coding.
Eliminate initial silence in Streams, Participants, Sessions.
Create layered encodings from non-layered originals.
The Multimedia server should have some admission control scheme for access to the server.
The User interaction channel should be secured.
Client/Server with powerful, heavy servers with lots of resources, minimal client programs and playback tools.
Distributed Multimedia server system with lots of small distributed servers using a common frontend or a lot of frontends using a common communication channel.
Which applications can be used to play back replayed streams.
Handles all direct user interaction with the Multimedia server
For Direct playback to a multicast address we need to assign one or more unique multicast address/port pairs. Such a mechanism is supported by
- listening for Session Announcement Protocol (SAP) packets on a certain multicast address/port.
- listening to a random multicast address and using it if no traffic is detected
- supporting the RTP/RTCP address/port pair collision recovery
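The first two mechanisms in the list above could be combined roughly as follows. This is a hedged sketch, not an actual allocation algorithm from any of the tools mentioned: the function `pick_free_pair`, the address range starting at 224.2.128.0 and the even port range are all assumptions made for illustration, and `in_use` stands for the set of address/port pairs that the server has learned about, for instance from SAP announcements. No real network listening is done here.

```python
import ipaddress
import random
from typing import Optional, Set, Tuple


def pick_free_pair(
    in_use: Set[Tuple[str, int]],
    rng: random.Random,
    base: str = "224.2.128.0",   # assumed start of the candidate range
    span: int = 32768,           # assumed number of candidate addresses
    max_tries: int = 1000,
) -> Optional[Tuple[str, int]]:
    """Pick a random multicast address/port pair not known to be in use."""
    first = int(ipaddress.IPv4Address(base))
    for _ in range(max_tries):
        addr = str(ipaddress.IPv4Address(first + rng.randrange(span)))
        port = 2 * rng.randrange(8192, 16384)  # even port, 16384 to 32766
        if (addr, port) not in in_use:
            return addr, port
    # Give up; the caller may retry or rely on collision recovery.
    return None
```

Since announcements can be lost, a pair chosen this way can still collide, which is why the collision recovery in the third item remains useful.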
The Playback initiator optionally supplies a list of Participants to receive the Playback. These can be invited using the Session Invitation Protocol provided they have tools handling this protocol.
In the case no support for SIP can be depended upon, a Session Description Protocol packet can be generated and sent to the Playback initiator. It's then up to the Playback initiator to contact the other Participants.
Using the list of Participants given by the Playback initiator, a minimal scope (TTL) for the Playback session can be computed (for example by using traceroute).
Using a Session Description Protocol packet given by the Playback initiator, the same TTL can be used for the Playback session.
Optionally a TTL can be given by the Playback initiator.
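The choice between a computed and an explicitly given TTL can be sketched like this. The sketch assumes that hop counts to each Participant have already been measured (for example with traceroute); the function name `playback_ttl` and the `margin` parameter are hypothetical, and TTL thresholds configured on multicast tunnels are ignored.

```python
from typing import Mapping, Optional


def playback_ttl(hops: Mapping[str, int],
                 explicit_ttl: Optional[int] = None,
                 margin: int = 1) -> int:
    """Return the TTL to use for a Playback session.

    `hops` maps each Participant to its measured hop count.
    An explicitly given TTL always takes precedence.
    """
    if explicit_ttl is not None:
        return explicit_ttl
    if not hops:
        raise ValueError("no Participants and no explicit TTL given")
    # The farthest Participant decides the scope; 255 is the largest IP TTL.
    return min(255, max(hops.values()) + margin)
```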
of Streams, Participants, Sources, and Sessions. Using unique sender ID's in RTCP sender reports or UDP/IP IPaddress/port pairs and payload-specific information.
Sender information exists in the vat protocol and in RTP. This information, or part of it, can be played back to create a more or less genuine replay of the original Session, also providing awareness-related and other information.
To be able to correctly record a Session that includes tools using reliable multicast, the Multimedia server must participate actively, for example by requesting retransmission of lost packets. This means that the Multimedia server must support the protocols used by the applications.
A single entrypoint for Playback, Recording, Editing, Admission control.
A common interface for all functionality (except playback?), preferably graphical and WWW-based.
High interactivity reduces scalability gains.
How to manage multicast addresses to take advantage of the fact that a few users are looking at the same clip.
The scalability gain appears close to the receivers/sender.
Handling multicast addresses is (so far) more computationally expensive than handling ordinary unicast addresses.
In the case that we have a "moderator" and a bunch of "listeners" we still have use of multicast delivery.
Realtime-reliability tradeoffs based on retransmission are "unbalanced" in a 1:N-system.
Realtime-reliability tradeoffs based on buffering at the receiver side seems to help a bit.
A lower-level (link or network layer) reliability guarantee (QoS) seems to be the best alternative.
Use TTL-scope. The Playback initiator controls the size of the potential audience by setting TTL to a suitable value.
Use encryption (which will affect real-time performance).
Either for recording or Indirect playback. Uses SIP.
The Internet is the target; therefore support for layered coding and transmission is needed.
Store data by
- Session
- Participant
- Stream
and use indexes for arbitrary access of streams
- Packet contents
- Headers
- Time of arrival
- Payload information
- Source address
- RTCP source info
- Inter-stream synchronizing information
- References to streams
- Multicast addresses used
- Reference to Participants
- Inter-participant synchronization information
- Indexes into streams (level 0 index)
- reference to data (byte offset)
- arrival timestamp
- reference to Meta-data (for example a text or HTML description)
- Bookmarks into level 0 indexes
- Arbitrary index levels supported
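As an illustration of how a level 0 index and bookmarks could give random access into a recorded Stream, here is a small sketch. It is not the storage format of any server described in this report; the class `StreamIndex` and its method names are hypothetical, and it assumes that packets are indexed in arrival order.

```python
import bisect
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StreamIndex:
    """Level 0 index of one recorded Stream plus named bookmarks into it."""
    arrivals: List[float] = field(default_factory=list)   # arrival timestamps, ascending
    offsets: List[int] = field(default_factory=list)      # byte offset of each packet
    bookmarks: Dict[str, int] = field(default_factory=dict)  # name -> index position

    def add(self, arrival: float, offset: int) -> None:
        """Append one packet; packets are assumed to arrive in time order."""
        self.arrivals.append(arrival)
        self.offsets.append(offset)

    def position_at(self, t: float) -> int:
        """Index position of the first packet that arrived at or after t."""
        pos = bisect.bisect_left(self.arrivals, t)
        if pos == len(self.arrivals):
            raise ValueError("time is past the end of the recording")
        return pos

    def set_bookmark(self, name: str, t: float) -> None:
        """Store a named bookmark pointing into the level 0 index."""
        self.bookmarks[name] = self.position_at(t)

    def offset_of_bookmark(self, name: str) -> int:
        """Byte offset in the stored data where playback should resume."""
        return self.offsets[self.bookmarks[name]]
```

Higher index levels could be built the same way, with each level holding positions in the level below instead of byte offsets.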
Different solutions are better for different media-types
- Raw partition - No limit in file size
- File system
- Relational DBMS
- Object-oriented DBMS
If the Multimedia server depends on certain hardware configuration(s).
- Max Number of Users
- Max Number of Streams
- Max output bandwidth
- Max Number of video Frames per second in a Stream
- Max Size in pixels of a video Frame
- Max Number of Users
- Max Number of Streams
- Max output bandwidth
Consumers are mostly humans, therefore we get the following constraints:
- More than one media at the same time. I.e. multiple different constraints on receiver buffering delay, loss and jitter.
Media      | Buffering Delay | Jitter sensitive | Loss sensitive
Audio      | Long            | Little           | Very
Video      | Very short      | Little           | Not
Other data | Medium          | Not              | Very
- Random access and recording. Highly interactive. I.e. roundtrip-time constraints on interactions.
- Person-to-system. I.e. less roundtrip-time constraints on replay. People tend to be more patient when dealing with machines.
With these constraints in mind we see that a good Multimedia-server should behave as follows:
- Have a relatively long initial delay before replay when data is buffered at the receiver
- Prioritize the replay of media at the receiver in the following order: Video, Audio, Data
- Prioritize the transfer and reception of media in the following order: Interactions, Audio, Data, Video.
- Buffer a little Audio, and much Data before replay at the receiver.
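The transfer priority in the list above (Interactions, Audio, Data, Video) maps naturally onto a priority queue. The following is a minimal sketch under that reading; the class `SendQueue` and the string labels for the kinds of traffic are hypothetical, and no real network transmission is involved.

```python
import heapq
import itertools

# Lower number means sent first: Interactions, Audio, Data, Video.
PRIORITY = {"interaction": 0, "audio": 1, "data": 2, "video": 3}


class SendQueue:
    """Outgoing packet queue ordered by media priority, FIFO within a kind."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps arrival order

    def push(self, kind, packet):
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._counter), packet))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

With strict priorities like these, video can starve when the link is saturated, so a real server would probably combine this with some rate limit per kind.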
Integrated Web-media servers that deliver interactive, real-time, high-quality MPEG1 and MPEG2 video and audio streams to Web clients via IP and ATM networks.
- Content Management
- Scalable Media Delivery Services (the video pump)
- Web Integration
- High-Performance Networking
- LAN Multicasting
- Storage Management
- Operations and Management Tools
- Cosmo(TM) StreamPlayer Video Client
28.8 Kbps to 8 Mbps bandwidth for Internet and intranet applications. Cosmo MediaBase 2.0 will run on the Origin(TM) systems. See http://www.sgi.com/Products/WebFORCE/Products/Mediabase/
see http://www.informatik.uni-mannheim.de/~keller/XMovie/XMovie.html
A CMClient (Continuous Media client) in the Xserver-machine using MTP (Movie Transmission Protocol) on TCP and UDP to communicate with a Video server that's located somewhere else. The X11 protocol is extended to handle continuous media streams. The CMClient uses a shared memory buffer and the extended X11 to communicate with the Xserver. The CMClient uses AF (Audio File) protocol and the extended X11 to communicate with the AudioFile server that also resides in the Xserver-machine. The playback application uses MCAM (Movie Control, Access and Management) protocol to communicate with the CMClient.
The Xserver doesn't have to reside in the same machine as the playback application and the X11 need less bandwidth for CM. This is a good solution for viewing video streams on dumb Xterminals (No disk, very little memory).
Of course, neither the local network nor the Xserver machine can cope with more than 2 - 3 CMClients at a time and a few more player applications (watching the same stream), because all decoding has to be done in the CMClient and the Xserver has to mediate the full video frames to the playback application. And the AudioFile server?
But can this extended X11 support sharing of X applications?
The Voyager system has a distributed client-server architecture, where the server side consists of a set of cgi-scripts for session browsing, recording, and playback; a pair of low-level daemons for record and playback; a Perl-based architecture for gluing the layers together (using nPerl for interdaemon communication (1)); and a relational database to act as a central repository of information for the system as a whole. The design goal is to serve large numbers of high-bandwidth streams (nominal 5Mbps or more) both in record and playback mode.
We use the IBM TigerShark multimedia filesystem on the server machine, in order to do striping across multiple disk server nodes in the SP2 we're using as a server.
On the client side we just use the mbone tools -- vic/vat or the Precept tools.
We demonstrated an earlier version of Voyager at Supercomputing '95 -- see http://www.iway.org/video/index.html for a little info; information on the current version is available at http://voyager.mcs.anl.gov/Voyager/.
(1) Nexus/Perl -- see http://www.mcs.anl.gov/nexus/nperl/
A while ago I implemented one of the recording and playback applications on the MBone, the MBone-VCR, but unfortunately it is a little outdated since I didn't have enough time to look after it after I left ICSI (where I developed it). Anyhow, some ideas were still in my mind, and together with two students in Mannheim we were able to start a new project a couple of weeks ago that we called: "MBone-VCR on demand Service"
Well, we just started, so there are lots of things we have to investigate further, but since John brought the issue up, I thought this might be a good point to throw our ideas in the pot as well...
A brief overview of our project goals so far:
- develop a distributed client-server architecture where MBone-VCR clients request recordings and playbacks from an MBone-VCR server.
- clients will be Java applets so they can be viewed and run with any Java-capable browser on any platform. The clients take care of launching the appropriate applications to view the recordings. The information about which application is appropriate comes from the server. The interaction between clients and servers will be via remote objects.
- server will be implemented partially as Java applications (not applets!) and partially in C++. The interface to the client will be implemented in Java (so we can use the same RO-stack); time-critical tasks like retrieving RTP packets from the net and storing them in a vcr-file will be implemented in C++ and bound to the Java application as a shared library. The server will implement an interface to sdap in order to be able to receive session announcements (which the server will offer the clients as possible candidates that they can record) and to announce playbacks requested by clients.
- define a protocol between clients and servers that will provide:
- session control and management e.g.:
- listing of available session announcements
- listing of available recordings
- scheduling of recordings
- scheduling of playbacks
- change parameters of a session (e.g. lower the ttl)
- stream control and management e.g.
- play, stop, pause, record, ff, rew, set index, jump to next index, ...
- events like "end of recording", "sync" or "update timing"
- floor control
- allow only one client to control the session while multiple clients may receive it.
- allow passing of the floor control token
- define a standardized rtp-file format to store rtp-packets:
- The file format should be:
- media independent
- self describing
- interchangeable
- efficient
- and optionally support
- fast random access
- indexing
- editing
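The floor control goal in the list above (one client controls the session, many may receive it, and the token can be passed on) could look roughly like this. It is only a sketch of the idea, not the protocol the project defines; the class `Floor` and its method names are hypothetical, and clients are represented by plain strings.

```python
class Floor:
    """Floor control for one session: a single token holder, many receivers."""

    def __init__(self, owner):
        self.holder = owner        # the only client allowed to control
        self.receivers = {owner}   # all clients receiving the session

    def join(self, client):
        """Let another client receive the session without control rights."""
        self.receivers.add(client)

    def control(self, client, command):
        """Accept a stream control command only from the token holder."""
        if client != self.holder:
            raise PermissionError(f"{client} does not hold the floor")
        return command

    def pass_token(self, current, new):
        """Hand the floor control token from the holder to another receiver."""
        if current != self.holder:
            raise PermissionError(f"{current} does not hold the floor")
        if new not in self.receivers:
            raise ValueError(f"{new} is not receiving this session")
        self.holder = new
```

In use, a command such as "pause" from a client without the token is rejected until the holder passes the token to that client.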
There are a lot more things to mention, but I think you get the idea and we have enough to talk about and discuss. We are at the very beginning of the project and would be happy to have a lively discussion and new ideas that we may include in the architecture.
Contact:
Wieland Holfelder
University of Mannheim, Praktische Informatik IV
L 15,16, 68131 Mannheim, Germany
Email: whd@pi4.informatik.uni-mannheim.de
Fax: +49-621-292-5745
Phone: +49-621-292-3300
http://www.informatik.uni-mannheim.de/~whd
This is a short summary from the Oracle Video server manual. We haven't been able to verify the information in the manual yet because of a lack of funds and personnel to do the installation and run the tests.
Platform:
PC/Windows 3.11
API:
16-bit OLE API, for example Visual Basic 4.0.
Format:
MPEG-1
Comment:
Oracle Media Net uses UDP/IP, so let's implement a new client
Platform:
Sun, HP/Unix
API:
System Services API, C
Formats:
MPEG-1 implemented
Supports storing of Quicktime, MPEG-1 & 2, TrueMotion, DVI, JPEG, etc.
Can be expanded through System Services API, C
Features:
Multi-user,
Striping & RAID Disk access
Comments:
I like the disk handling. The API is bearable.
Performance not affected by disk fragmentation.
Overview:
API:
Media Net RPC, Media Net XDR, Semaphores, Media Net Transport Layer API, System Services API, C
Protocol:
UDP/IP
Possible to expand to RTP/UDP/IP, for example.
Features:
Oracle Media Net Logger Process
Oracle Media Net Name Server (Servers and clients have their own unique name space built on top of underlying transport layer(UDP))
Oracle Media Net Address Server (Maps Media Net addresses to corresponding
physical addresses (UDP == physical address?))
Oracle Media Net Process Server (Typical portmap-process)
Connection Service (???)
A typical convenience functions library in C. Cannot see any objects though.
1) Total Bandwidth demand?
bw_demand = #streams * average_bitrate * 1.2
(the factor 1.2 accounts for adm., loss, junk)
2) FDDI network interface
30 streams @ 2 Mbps
40 streams @ 1.5 Mbps
3) Max output from Sparc center 1000
4 FDDI cards:
120 streams @ 2 Mbps => 240 Mbps
bw_demand = 240 * 1.2 = 288 Mbps
Needs approx. 256 MB RAM?
4) Media clips and all that
Is supported.
5) Videopump
Max # video streams = 20 (MPEG version)
Max # video streams from Oracle Video Server, an implementation of
Oracle Media Server I think. We probably can write own media pump
programs for the Media Server that can encode/transmit other than
MPEG over UDP
6) RTP MPEG-1 & 2 payloads defined
7) RTP JPEG payloads defined
JPEG also supported in the System Services API of the Media Server
according to the documentation.
8) RTP over UDP/IP
Because RTP has
- Sequence numbering: Ignore out-of-date packets and copies
- RTP routers terminate network loops
- Conference control and management (Also supported by the OMN unfortunately)
- Time stamps: Connection quality and delay
- Data rate- and routing negotiation possible via RTCP
- Implementations exist
Can use System Services API to encapsulate any format in RTP, but how do we handle RTCP messaging?
9) Do students have to run their own Mrouted at home to be able to receive?
10) How many ISDN BRI or Modem 28.8 connections on 240 Mbps?
11) Media Net RPC
Is it implemented or not?
- run-time lib
- interface def. language
- error handling
- RPC compiler (Not yet?)
12) Oracle Media Server usability
Oracle Video Server and Oracle Media Net.
Can use on LAN for a few streams, but we have to modify to other format
and bitrates.
For Internet use and ISDN/Modem users we need a lot of modifications.
The current Server user and programmer interfaces are horrible.
Disk- and request handling in the server is good.
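The bandwidth arithmetic in items 1 to 3 and the question in item 10 of the notes above can be checked with a few lines of Python. This is just a sketch of the calculation: the function names are hypothetical, the ISDN BRI rate of 128 kbps (two 64 kbps B channels) is filled in by me and is not stated in the notes, and the connection count is a raw division that ignores the 1.2 overhead factor and all protocol overhead.

```python
OVERHEAD = 1.2  # overhead factor from item 1 of the notes


def bw_demand(streams, avg_bitrate_mbps, overhead=OVERHEAD):
    """Total bandwidth demand in Mbps for a number of equal-rate streams."""
    return streams * avg_bitrate_mbps * overhead


def max_connections(capacity_mbps, link_kbps):
    """How many access links of a given rate fit into the capacity.

    Uses 1 Mbps = 1000 kbps and ignores overhead (an assumption).
    """
    return int(capacity_mbps * 1000 // link_kbps)


if __name__ == "__main__":
    print(bw_demand(120, 2))           # item 3: 4 cards * 30 streams @ 2 Mbps
    print(max_connections(240, 128))   # ISDN BRI, both B channels
    print(max_connections(240, 28.8))  # 28.8 kbps modems
```

The first call reproduces the 288 Mbps figure from item 3. The two connection counts are upper bounds only, since every real connection also carries packet headers and control traffic.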
KTH/Teleinformatics has got a Sun MediaCenter 20 for evaluation. This is another video server platform, equipped with three 100 Mbps Ethernet interfaces and the ability to stream MPEG over UDP/IP. The client runs on Sun/Solaris and PC.
http://www.sun.com/products-n-solutions/hw/servers/smc_external.html
We have six elements in our current model: Servers, Splitters, License Servers, System Managers, Players, and Encoders. These elements form a tree hierarchy for any given stream. Each stream has an identifier (URL) which identifies the server originating it and the object in that server's namespace (the RealAudio file). Both live and on-demand streams (the former originating from live encoders and the latter originating from files) appear in a single hierarchical namespace maintained by the server.
The protocol used, RTSP, is currently under standardisation in the IETF MMUSIC WG.
See also http://www.real.com/products/server/index.html for a list of Streaming Media Servers.
Stereo audio at 28.8, near-CD quality at higher bitrates, AM-quality audio at 14.4. Newscast-quality video at 28.8 and full-motion at higher bitrates.
Player currently available for Windows95/NT and Macintosh PowerPC. In the future also available for
Linux 1.2.x with an OSS (Voxware) 3.0 sound driver
Linux 2.0.x with an OSS 3.5.x sound driver on a Pentium machine
Solaris 2.5
Solaris 2.4
SunOS 4.1.x
Irix 5.3
FreeBSD 2.0
See http://www.real.com/sysreq.html for detailed Player system requirements.
The RealVideo Encoder is currently available for Windows 95/NT. The RealVideo Encoder plug-in for Adobe Premiere is available for Macintosh Power PC, and comes with the Windows encoder.
See http://www.real.com/products/encoder/realvideo/sysreq.html for detailed RealVideo Encoder system requirements.
The RealAudio Encoder is currently available for Windows 95/NT, Macintosh PowerPC, Linux, HP/UX, AIX, FreeBSD, Irix and Digital Unix platforms. RealAudio Encoder 3.0 beta is currently available for Solaris 2.5. RealAudio Encoder 2.0 is still available for users with a Macintosh 68040 with FPU, and for users with SunOS 4.1. The RealAudio Xtras for SoundEdit 16 are currently available only with RealAudio Encoder 2.0.
See http://www.real.com/products/encoder/realaudio/sysreq.html for detailed RealAudio Encoder system requirements.
You first need a StreamWorks Encoder/Transmitter. It is a PC box we ship to your address DHL. It comes either in a PC Box form or a rack-mount configuration. It enables you to make your source streams from standard AV inputs (RCA, etc.) A full audio/video system costs $6,500. It is fully PAL compatible. The Audio and Video Transmitter is the one you would probably utilize for your audio and video broadcasts.
In order to send streams across the internet or LANs or WANs one needs our StreamWorks Server, which is software that you download from our site here that sits on your web server machine, UNIX box, or whatever you are currently using. Our products handle everything from 8.5 kbps (AM-quality mono audio) all the way up to 112 kbps (CD-quality audio, or 30 frames per second of MPEG-1 video (VHS quality) with AM-type audio on top of that).
A PC application based on a proprietary codec running over modem speeds (14.4 kbps, 28.8 kbps).
http://ctrl.cdt.luth.se/~peppar/progs/mMOD/
MBone audio, video, whiteboard, NetText, mWeb or any other UDP multicast or unicast session can be recorded and played back in 2 modes: raw UDP or parsed RTP. In parsed RTP recording it checks for duplicates and out-of-order packets and rearranges packets based on sequence numbers. All packets get a timestamp when recorded.
The mMOD system consists of distributed stand-alone VCR-programs and distributed Web controller-interfaces.
Messages are sent on a special multicast/port-pair or TCP/port.
Multicasted messages:
alive_message = All active VCR-programs respond with their unique ID, session name, owner, playing/recording, media and control-port number.
Multicasted or direct messages:
stop, pause, continue, skip, rewind, jump
No group control of a multicasted session possible. Only the owner controls a session.
The mMOD is completely written in Java v1.0.2 and is available for SUN/Solaris and Windows95/NT4.0.
The VCR is a stand-alone program and also includes a minimal HTTP-server listening on the control-port. This enables 1) a Java-client that is directly connected to the VCR and 2) use of the mWeb slide-show presentation feature.
The Web-controller uses the MIME-type application/x-sdp and SDP-files for launching appropriate helper applications on the client-side. The Web-controller has CGI-based and Java user interfaces.
Windows 95 or NT, PowerMac. H.263 video compression and G.723 audio compression. Once the original .AVI file is converted to a .VIV file, the Web site developer simply uses the HTML <EMBED> command to embed the VIVO file into the Web page, the same way he or she would embed a .GIF or .JPG file. Free VivoActive Players for a variety of browsers and platforms (plug-ins for Netscape Navigator on the Macintosh, Windows 95, Windows NT, Windows 3.1, and an ActiveX control for Microsoft Internet Explorer on Windows 95 or NT). The video is streamed from the Web server via HTTP over TCP, just like the rest of the page.
http://hydra.precept.com/products/ip-tv.htm
IP/TV uses RTP over standard IP Multicast to deliver full-screen, full-motion video to desktop PCs over private IP routed networks and the Internet. IP/TV has three elements:
- the Program Guide, which schedules programs, controls user access and manages bandwidth usage. Runs on any NT or Unix WWW-server.
- the Video Server, which delivers live or prerecorded video from such devices as cameras and VCRs according to schedules in the Program Guide
- the Viewer, which presents the user with a list of scheduled programs.
Microsoft NetShow 2.0 provides an easy, powerful way to stream multimedia across the Internet and intranets. It is a complete, high-performance system for broadcasting live and on-demand audio, video and mixed multimedia. NetShow technology is built on open industry standards. NetShow supports the ASF file format, which is capable of using media in MOV, AVI, WAV, BMP, DIB, JPEG and other standard formats. NetShow also supports the H.263 video standard, as well as transport and protocol standards such as UDP/IP, TCP/IP, HTTP, RTP and IP multicast. NetShow 2.0 beta software is available now without charge from the Microsoft Web site at http://www.microsoft.com/netshow/
Stuart Clayman, A Video Conference Recorder - Design and implementation, Department of Computer Science, University College London, December 1995.
Record, and Play UDP, TCP and IP packets. Telnet-based user interface.
No data yet
No data yet
Maintained by
Tobias Öbrink